REPORT  DOCUMENTATION  PAGE 


AD-A255  499 




1. AGENCY USE ONLY (Leave blank)


2.  REPORT  DATE 

August  1992 


4.  TITLE  AND  SUBTITLE 

Fault-Tolerant  Wait-Free  Shared  Objects 


3.  REPORT  TYPE  AND  DATES  COVERED 

Special  Technical 


5. FUNDING NUMBERS

NAG2-593 


6.  AUTHOR(S) 


Prasad  Jayanti,  Tushar  Deepak  Chandra, 
Sam  Toueg 


7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES)

Sam  Toueg,  Associate  Professor 
Department  of  Computer  Science 
Cornell  University 


8.  PERFORMING  ORGANIZATION 
REPORT  NUMBER 


92-1298 


9. SPONSORING/MONITORING AGENCY NAME(S) AND ADDRESS(ES)

DARPA/ISTO 


10. SPONSORING/MONITORING
AGENCY REPORT NUMBER


13. ABSTRACT (Maximum 200 words)

Please  see  page  1. 


17. SECURITY CLASSIFICATION OF REPORT

UNCLASSIFIED

18. SECURITY CLASSIFICATION OF THIS PAGE

UNCLASSIFIED


15.  NUMBER  OF  PAGES 

58 


16.  PRICE  CODE 


19. SECURITY CLASSIFICATION OF ABSTRACT

UNCLASSIFIED

20. LIMITATION OF ABSTRACT

UNLIMITED




Fault-Tolerant  Wait-Free 
Shared Objects**†

Prasad  Jayanti 
Tushar  Deepak  Chandra* 

Sam  Toueg 

TR  92-1298 

(Revision of TR 92-1281, April 1992)
August 1992


Department  of  Computer  Science 
Cornell  University 
Ithaca,  NY  14853-7501 


**A preliminary version of this paper will appear in the proceedings of the 33rd Annual
Symposium on Foundations of Computer Science, October 1992.

†Research supported by NSF grants CCR-8901780 and CCR-9102231,
DARPA/NASA Ames grant NAG 2-593, and grants from the IBM Endicott
Programming Laboratory.

*Also supported by an IBM graduate fellowship.


Fault-tolerant Wait-free Shared Objects

Prasad Jayanti    Tushar Deepak Chandra    Sam Toueg

{prasad, chandra, sam}@cs.cornell.edu
Department  of  Computer  Science 
Cornell  University 
Ithaca,  New  York  14853 

August  21,  1992 


Abstract 

A  concurrent  system  consists  of  processes  and  shared  objects.  Previous  research 
focused  on  the  problem  of  tolerating  process  failures.  We  study  the  complementary 
problem  of  tolerating  object  failures. 

We divide object failures into two broad classes: responsive and non-responsive.
With responsive failures, a faulty object responds to every invocation, but responses
may be incorrect. With non-responsive failures, a faulty object may also “hang” without
responding. For each class, we consider crash, omission, and arbitrary types of failures.

For each type of failure, we are seeking a universal implementation for fault-tolerant
wait-free  shared  objects.  We  present  (deterministic)  implementations  for  all  types  of 
responsive  failures,  including  arbitrary  failures.  In  contrast,  we  show  that  even  the  most 
benign  type  of  non-responsive  failures  requires  the  use  of  randomization. 

Of  special  interest  is  the  problem  of  implementing  fault-tolerant  objects  using  only 
objects of the same type. We present such fault-tolerant self-implementations for many
common  object  types. 

Graceful  degradation  is  a  desirable  property  of  fault-tolerant  implementations:  the 
implemented  object  never  fails  more  severely  than  the  base  objects  it  is  derived  from, 
even  if  all  the  base  objects  fail.  For  several  failure  models,  we  show  whether  this 
property  can  be  achieved,  and,  if  so,  how. 

In addition to the above possibility/impossibility results, we also consider the
resource complexity of fault-tolerant implementations. In many cases, we present lower
bounds and give matching algorithms.



1  Introduction 


1.1  Background  and  motivation 

A concurrent system consists of processes communicating via shared objects. Examples
of shared object types include data structures such as read/write register, queue, and
set, and synchronization primitives such as test&set, fetch&add, and compare&swap.
Even though different processes may concurrently access a shared object, the object must
behave as if all these accesses occur in some sequential order. More precisely, the behavior
of a shared object must be linearizable [HW90]. One way to ensure linearizability is to
implement shared objects using critical sections [CHP71]. This approach, however, is not
fault-tolerant: the crash of a process while in the critical section of a shared object can
permanently prevent the rest of the processes from accessing that object. This lack of
fault-tolerance led to the concept of wait-free implementations of shared objects. Informally, a
shared object is wait-free if every operation invocation on that object by every process is
guaranteed a response in finite time irrespective of the speed of the other processes, even if
some or all other processes in the system crash.

Thus, a concurrent system in which all shared objects are wait-free is resilient to
process crashes. However, such a system is not resilient to the failures of the shared objects
themselves.¹ For example, the “crash” of a single shared object stops all the processes that
need to access that object. Motivated by this observation, we study the problem of
implementing wait-free shared objects that are also fault-tolerant. With such objects, the system
is guaranteed to make progress despite process crashes and the failures of some underlying
objects. (To simplify notation, hereafter “object” denotes a “shared object”.)

The problem addressed in this paper is novel. A preliminary version appeared in
[JCT92a], and a summary of the results in [JCT92b]. An independent work by Afek,
Greenberg, Merritt, and Taubenfeld [AGMT92] has the same general goal, but differs in
many respects. We present a brief comparison of the two works in Section 8.

1.2  Object  failures 

We divide object failures into two broad classes: responsive and non-responsive. With
responsive failures, a faulty object responds to every invocation, but responses may be
incorrect. With non-responsive failures, a faulty object may also “hang” without responding.

We divide responsive failures into three models: R-crash, R-omission, and R-arbitrary.
An object that fails by R-crash behaves correctly until it fails, and once it fails, it returns
a distinguished response $\bot$ to every operation. As with R-crash, an object that fails by
R-omission may return a correct response or a $\bot$. However, even if it responds $\bot$ to a
process p, a subsequent operation by a different process q may get a correct response.
This behavior models an object O made of several components, some of which failed. The
operation by p “ran into” a failed component of O (and returned $\bot$), while the later one
by q only encountered correct components of O (and returned a correct response). Finally,
objects experiencing R-arbitrary failures may “lie”, i.e., return arbitrary responses.

¹ Even “software” objects have underlying hardware components. The software and/or the hardware
could be faulty.
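The three responsive models can be illustrated with a small sequential simulation. The following sketch is not part of the paper: the class names (`Consensus`, `RCrash`, `ROmission`, `RArbitrary`), the process identifiers, and the string standing in for the distinguished response are all assumptions made for illustration, and concurrency is not modeled.

```python
import random

BOT = "⊥"  # stand-in for the distinguished response (assumed encoding)

class Consensus:
    """Failure-free sequential consensus object: the first proposal wins."""
    def __init__(self):
        self.decided = None

    def propose(self, process, v):
        if self.decided is None:
            self.decided = v
        return self.decided

class RCrash(Consensus):
    """Correct for the first `good_ops` operations, then BOT forever."""
    def __init__(self, good_ops):
        super().__init__()
        self.remaining = good_ops

    def propose(self, process, v):
        if self.remaining <= 0:
            return BOT
        self.remaining -= 1
        return super().propose(process, v)

class ROmission(Consensus):
    """Returns BOT to the processes in `unlucky`; others get correct responses."""
    def __init__(self, unlucky):
        super().__init__()
        self.unlucky = set(unlucky)

    def propose(self, process, v):
        if process in self.unlucky:
            return BOT  # in this sketch the aborted operation has no effect
        return super().propose(process, v)

class RArbitrary(Consensus):
    """Always responds, but the response may be a lie."""
    def __init__(self, seed=0):
        super().__init__()
        self.rng = random.Random(seed)

    def propose(self, process, v):
        super().propose(process, v)
        return self.rng.choice([0, 1])
```

In this sketch an R-omission object can answer BOT to p and still answer q correctly afterwards, which is exactly the behavior an R-crash object rules out.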

Similarly, we divide non-responsive failures into crash, omission, and arbitrary. An
object that fails by crash behaves correctly until it fails, and once it fails, it stops responding.
An object that fails by omission may fail to respond to the invocations of an arbitrary subset
of processes, but continue to respond to the invocations of the remaining processes (forever).
The behavior of an object that experiences an arbitrary failure is completely unrestricted:
it may not respond, and even if it does, the response may be arbitrary.

1.3  Fault-tolerant  objects 

Let T be an object type and $\mathcal{L} = (T_1, T_2, \ldots, T_n)$ be a list of object types (the $T_i$’s are
not necessarily distinct). A wait-free implementation of T from $\mathcal{L}$ is a function $\mathcal{I}$ such that
given any distinct objects $O_1, O_2, \ldots, O_n$ of type $T_1, T_2, \ldots, T_n$, respectively,
$O = \mathcal{I}(O_1, O_2, \ldots, O_n)$ is an object of type T that behaves correctly if all $O_i$’s behave
correctly. Roughly speaking, an object behaves correctly if it is wait-free and its behavior is
consistent with its type. We say O is a derived object of the implementation $\mathcal{I}$, and
$O_1, O_2, \ldots, O_n$ are the base objects of O. The resource complexity of $\mathcal{I}$ is n, the number of
base objects required by $\mathcal{I}$ to implement a derived object. Such a wait-free implementation
$\mathcal{I}$ is t-tolerant for failure model $\mathcal{M}$ if O behaves correctly even if at most t base objects
of O fail by $\mathcal{M}$. In this Introduction, we write “implementation” as a shorthand for
“wait-free implementation”.

$\mathcal{I}$ is a self-implementation if $T_1 = T_2 = \cdots = T_n = T$. In other words, in a
self-implementation the base objects are of the same type as the derived object. For example,
consider the object type “2-process queue” (i.e., a queue that can be accessed by at most
two processes). In Section 5.3, we show that there is a t-tolerant self-implementation of
2-process queue for R-arbitrary failures. Intuitively, this means that using a set of wait-free
2-process queues, at most t of which may experience R-arbitrary failures, one can implement
a failure-free wait-free 2-process queue. Thus, in a self-implementation, fault-tolerance is
achieved through replication.

1.4  Results 

To study whether a general object type has a t-tolerant implementation, we focus on two
particular object types: consensus² and register. Herlihy [Her91] and Plotkin [Plo89]
showed that one can implement a wait-free object of any type (for which a sequential
implementation exists) using only consensus and register objects. Thus, if consensus and
register have t-tolerant implementations, then every object type has a t-tolerant
implementation.

² A consensus object supports two operations, propose 0 and propose 1, and has the following sequential
specification: if the first operation on the object is propose v ($v \in \{0, 1\}$), then every operation is returned
the response v.

We first study the problem of tolerating responsive failures. We give t-tolerant
self-implementations of consensus for R-crash, R-omission, and R-arbitrary failures. For
R-crash and R-omission failures, our self-implementation is optimal, requiring only t + 1
base consensus objects. For R-arbitrary failures, our self-implementation is efficient,
requiring $O(t \log t)$ base consensus objects. We also give t-tolerant self-implementations of
register for R-crash, R-omission, and R-arbitrary failures. Combining the above results
with [Her91, Plo89], we conclude that every object type T has a t-tolerant implementation
(from consensus and register) for all responsive models of failures. Moreover, if T
implements consensus and register, then T has a t-tolerant self-implementation. This
implies that familiar object types such as (2-process) fetch&add, queue, stack, test&set,
and (N-process) compare&swap, move, swap have t-tolerant self-implementations even for
R-arbitrary failures!
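The paper's own constructions are given in Section 5.1 and are not reproduced in this part of the text. As a hedged illustration of why t + 1 base objects can plausibly suffice when faulty objects only ever answer correctly or with the distinguished response, the sketch below visits the base objects in a fixed order, proposes the current estimate to each, and adopts any response other than the distinguished one. This scheme, the class names `BaseConsensus` and `TolerantConsensus`, and the `fails_for` parameter are assumptions for illustration and are not claimed to be the authors' algorithm; the simulation is sequential and each process proposes once.

```python
BOT = "⊥"  # stand-in for the distinguished response (assumed encoding)

class BaseConsensus:
    """Sequential consensus object; answers BOT to processes in `fails_for`."""
    def __init__(self, fails_for=()):
        self.decided = None
        self.fails_for = set(fails_for)  # hypothetical fault-injection knob

    def propose(self, process, v):
        if process in self.fails_for:
            return BOT
        if self.decided is None:
            self.decided = v
        return self.decided

class TolerantConsensus:
    """Derived object built from t + 1 base objects, at most t of them faulty."""
    def __init__(self, base):
        self.base = list(base)

    def propose(self, process, v):
        estimate = v
        for obj in self.base:          # same fixed order for every process
            res = obj.propose(process, estimate)
            if res != BOT:             # adopt any non-aborted response
                estimate = res
        return estimate

# t = 2: two faulty base objects and one correct one.
derived = TolerantConsensus([
    BaseConsensus(fails_for={"p"}),
    BaseConsensus(fails_for={"q"}),
    BaseConsensus(),
])
results = [derived.propose("p", 1), derived.propose("q", 0), derived.propose("r", 0)]
```

The intuition behind the sketch is that every process leaves the one correct base object holding the same estimate, and later base objects can then only return that estimate or the distinguished response.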

What about tolerating non-responsive failures? We first show that there is no
1-tolerant implementation of consensus even for crash failures, the most benign of the
non-responsive models of failures.³ This immediately implies that any object type T that
implements consensus, such as fetch&add, queue, stack, test&set, compare&swap, move,
sticky-bit, swap, has no 1-tolerant implementation for crash failures. In contrast, we
show that register has a t-tolerant self-implementation even for arbitrary failures. Since
randomized implementations of consensus from register are well known (for example,
see [Asp90]), the above result implies that every object type has a randomized t-tolerant
implementation from register even for arbitrary failures. In addition to these universality
and impossibility results, this paper contains the following results.

Consider a t-tolerant implementation for failure model $\mathcal{M}$. By definition, a derived
object of this implementation is guaranteed to behave correctly even if up to t base objects
fail by $\mathcal{M}$. But what happens if more than t base objects fail? In general, the derived
object may experience a more severe failure than $\mathcal{M}$. In other words, implementations
may “amplify” failures: derived objects may fail more severely than base objects. This
undesirable behavior is prevented by implementations that are “gracefully degrading”. An
implementation is gracefully degrading for failure model $\mathcal{M}$ if it has the following property:
if base objects only fail by $\mathcal{M}$, then derived objects also fail by $\mathcal{M}$.

From a 1-tolerant gracefully degrading self-implementation of any object type T for a
failure model $\mathcal{M}$, we show how to recursively construct a t-tolerant gracefully degrading
self-implementation of T for $\mathcal{M}$. Thus, graceful degradation provides a method for automatically
increasing the fault-tolerance of an implementation.

Requiring graceful degradation may increase the cost of an implementation. For
instance, consider t-tolerant implementations of consensus for R-omission failures. We
present two such implementations. One uses only t + 1 base objects, but is not
gracefully degrading. The other is gracefully degrading, but requires 2t + 1 base objects. In
fact, we show that graceful degradation for R-omission failures requires at least 2t + 1 base
objects (this lower bound holds for every deterministic non-trivial type).

³ The impossibility of implementing a fault-tolerant consensus object from any finite list of base objects,
one of which may crash, is shown using the impossibility of solving the consensus problem among a finite
number of processes, one of which may crash [FLP85, LAA87].

In some cases, graceful degradation cannot even be achieved. In particular, we show
that there is a large class of object types that have no gracefully degrading implementations
for R-crash. Intuitively, this means that whatever the implementation, the failure of the
implemented object will be more severe than R-crash, even if all its base objects can only
fail by R-crash. In other words, with R-crash, implementations necessarily amplify failures.
In contrast, we prove the following strong possibility result for R-omission: Every object
type has a t-tolerant gracefully degrading implementation from consensus and register
for R-omission.

We study the problem of translating severe failures into more benign failures [NT90].
In particular, we show that given 3t + 1 (base) consensus objects, at most t of which may
experience R-arbitrary failures, we can implement a consensus object that can only fail
by R-omission. We prove that this translation from R-arbitrary to R-omission is resource
optimal.

We also show that arbitrary failures can be viewed as having two orthogonal
components: omission and R-arbitrary. Specifically, for any object type T, given any t-tolerant
self-implementations $\mathcal{I}'$ and $\mathcal{I}''$ of T for omission failures and R-arbitrary failures,
respectively, we show how to construct a t-tolerant self-implementation of T for arbitrary failures.
This decomposition simplifies the problem of tolerating arbitrary failures.

The paper is organized as follows. We give an informal system model and define several
types of object failures in Sections 2 and 3. We define the concepts of t-tolerant wait-free
implementation and graceful degradation in Section 4. We provide a formal presentation of
the material of Sections 2, 3, and 4 in Appendices A, B, and C, respectively. In Section 5,
we show how to implement objects that tolerate responsive failures. We present t-tolerant
implementations of consensus in Section 5.1, of register in Section 5.2, and of arbitrary
types in Section 5.3. The results on the cost of graceful degradation, and on the translation
between failure models are also presented in Section 5.1. In Section 6, we study the
feasibility of fault-tolerant implementations for non-responsive object failures. We first prove
that many common object types including consensus have no 1-tolerant implementations
for crash. In contrast, we show that register has a t-tolerant self-implementation even
for arbitrary failures. We finally show that every object type has a t-tolerant randomized
implementation from register even for arbitrary failures. In Section 7, we study graceful
degradation for the R-crash and R-omission failure models. We present impossibility
results for R-crash and a universality result for R-omission. In Section 8, we present a brief
comparison with the results in [AGMT92]. In Appendix D, we define the object types that
appear in this paper.


2  Informal  model 

A concurrent system consists of processes and shared objects. Associated with each object
is a type. The type characterizes the expected behavior of the object. More precisely, an
object type T is a tuple (N, OP, RES, G), where N is an integer greater than one, OP and
RES are sets of operations and responses respectively, and G is a directed finite or infinite
graph in which each edge has a label of the form (op, res) where $op \in OP$ and $res \in RES$.
Intuitively, if O is an object of type T, then O supports the operations in OP and may be
shared by N processes (we say T is an N-process type). G specifies the expected behavior
of O in the absence of concurrent operations on O.

The vertices of G are the states of T. One state of T is the initial state. A state s of
T is reachable if there is a path in G from the initial state to s. We assume that every state
of T is reachable. A sequence $S = (op_1, res_1), (op_2, res_2), \ldots, (op_l, res_l)$ is consistent from a
state s of T if there is a path labeled S in G from the state s. S is consistent with respect
to T if it is consistent from the initial state of T. T is deterministic if for every state s of
T and every operation $op \in OP$, there is at most one edge from s labeled (op, res). T is
non-deterministic otherwise. T is finite if G is finite; T is infinite otherwise.
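The definitions of type, consistent sequence, and determinism can be made concrete with a short sketch. The representation below (a dictionary of labeled edges, the `ObjectType` class, and the `consensus_type` helper with its state names) is an assumption chosen for illustration; the paper defines types abstractly and formally in its appendices.

```python
from dataclasses import dataclass, field

@dataclass
class ObjectType:
    """A type T = (N, OP, RES, G), with G stored as labeled edges."""
    n: int                                      # number of sharing processes
    initial: object                             # initial state (a vertex of G)
    edges: dict = field(default_factory=dict)   # state -> [(op, res, next_state)]

    def consistent_from(self, state, seq):
        """True iff some path from `state` is labeled by the (op, res) sequence."""
        frontier = {state}
        for op, res in seq:
            frontier = {nxt
                        for s in frontier
                        for (o, r, nxt) in self.edges.get(s, [])
                        if (o, r) == (op, res)}
            if not frontier:
                return False
        return True

    def consistent(self, seq):
        """Consistent with respect to T: consistent from the initial state."""
        return self.consistent_from(self.initial, seq)

    def deterministic(self):
        """At most one outgoing edge per (state, operation) pair."""
        for out in self.edges.values():
            ops = [o for (o, _, _) in out]
            if len(ops) != len(set(ops)):
                return False
        return True

def consensus_type(n):
    """Consensus as a three-state graph (state names are hypothetical)."""
    edges = {"undecided": [("propose 0", 0, "d0"), ("propose 1", 1, "d1")]}
    for v, s in ((0, "d0"), (1, "d1")):
        edges[s] = [("propose 0", v, s), ("propose 1", v, s)]
    return ObjectType(n=n, initial="undecided", edges=edges)
```

With this encoding, the sequence (propose 1, 1), (propose 0, 1) is consistent with respect to consensus, while (propose 1, 1), (propose 0, 0) is not.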

An object O of type T supports the set of procedures Apply(P, op, O), for each
process P and operation op in OP(T). A process P invokes operation op on object O by
calling Apply(P, op, O), and executes the operation by executing this procedure. The
operation completes when the procedure terminates. The response for an operation is the value
returned by the procedure.

The sequential specification of an object O, given by its type, is not sufficient to predict
O’s behavior in the presence of concurrent operations. To characterize such behavior, we
use the concept of linearizability [HW90, Lam86]. Roughly speaking, linearizability requires
every operation execution to appear to take effect instantaneously at some point in time
between its invocation and response. We make it more precise below.

An execution of a concurrent system is an interleaving of the steps of the processes
and the invocations and responses of the objects. Consider an execution E of a concurrent
system consisting of an object O that is shared by processes $P_1, P_2, \ldots, P_N$. The history
$\mathcal{H}$ of O in E is a set defined as follows: $(P_i, op, v, t_s, t_e) \in \mathcal{H}$ iff in execution E, process
$P_i$ invokes op at time $t_s$, and this operation completes at time $t_e$ returning the response
v. Further, $(P_i, op, *, t_s, \infty) \in \mathcal{H}$ iff process $P_i$ invokes op at time $t_s$, and this operation
does not complete. A history is complete if it has no incomplete operations. Given two
operations $(P_i, op, v, t_s, t_e)$ and $(P_j, op', v', t'_s, t'_e)$ in a history, we say $(P_i, op, v, t_s, t_e)$ precedes
$(P_j, op', v', t'_s, t'_e)$ if $t_e < t'_s$. A complete history $\mathcal{H}$ is linearizable with respect to a type T if
there is a sequencing S of the tuples (operations) in $\mathcal{H}$ such that S respects the 'precedes'
relation, and is consistent with respect to T. A history $\mathcal{H}$ is linearizable with respect to
a type T if a linearizable complete history $\mathcal{H}'$ can be obtained from $\mathcal{H}$ as follows: each
incomplete operation $(P_i, op, *, t_s, \infty)$ in $\mathcal{H}$ is either removed or replaced by a complete
operation $(P_i, op, v, t_s, t_e)$, for some response v and time $t_e$. This definition captures the
notion that some incomplete operations in $\mathcal{H}$ had a “visible” effect, while the others did
not.
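The definition of linearizability above can be checked by brute force on small histories. The sketch below is an illustration under stated assumptions: operations are tuples (P, op, v, t_s, t_e), the type is deterministic and given as a hypothetical function `step(state, op)` returning a response and next state, an incomplete operation carries the response `"*"` and completion time infinity, and a kept incomplete operation is allowed to match any response. The search is exponential and only meant for tiny examples.

```python
from itertools import combinations, permutations

INF = float("inf")
PENDING = "*"  # response placeholder of an incomplete operation

def precedes(a, b):
    """a precedes b iff a's completion time is before b's invocation time."""
    return a[4] < b[3]

def consistent(seq, initial, step):
    """Replay seq sequentially and compare responses with the type's."""
    state = initial
    for (_, op, v, _, _) in seq:
        res, state = step(state, op)
        if v != PENDING and res != v:
            return False
    return True

def linearizable_complete(ops, initial, step):
    """Search for a sequencing that respects 'precedes' and is consistent."""
    for seq in permutations(ops):
        respects = all(not precedes(seq[j], seq[i])
                       for i in range(len(seq))
                       for j in range(i + 1, len(seq)))
        if respects and consistent(seq, initial, step):
            return True
    return False

def linearizable(history, initial, step):
    """Each incomplete operation is either removed or kept (any response)."""
    complete = [o for o in history if o[4] != INF]
    pending = [o for o in history if o[4] == INF]
    for k in range(len(pending) + 1):
        for kept in combinations(pending, k):
            if linearizable_complete(complete + list(kept), initial, step):
                return True
    return False

def register_step(state, op):
    """Hypothetical read/write register: ("write", x) or ("read",)."""
    if op[0] == "write":
        return "ok", op[1]
    return state, state
```

Keeping the completion time of a retained incomplete operation at infinity imposes the fewest 'precedes' constraints, so it loses no generality in this check.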

Processes are asynchronous: i.e., there are no bounds on the relative speeds of the
processes. Furthermore, a process may crash: i.e., a process may stop at an arbitrary point
in an execution and never take any steps thereafter. The concept of wait-freedom was
introduced to cope with such processes (for example, see [Her91]). An object O is wait-free
in an execution E if either (i) E is finite, or (ii) every operation on O invoked by a process
that does not crash in E gets a response from O.

An object O is correct in execution E iff (i) O is wait-free in E, and (ii) the history of
O in E is linearizable with respect to the type of O. We say that O fails in E if O is not
correct in E. Even a faulty object may satisfy certain properties which depend on the type
of failure it suffered. We postpone the definition of the failure models to the next section.

Let T be an object type and $\mathcal{L} = (T_1, T_2, \ldots, T_n)$ be a list of object types (the $T_i$’s
are not necessarily distinct). A wait-free implementation of T from $\mathcal{L}$ is a function $\mathcal{I}$
such that given any distinct objects $O_1, O_2, \ldots, O_n$ of type $T_1, T_2, \ldots, T_n$, respectively,
$O = \mathcal{I}(O_1, O_2, \ldots, O_n)$ is an object of type T with the following property: In every
execution, if $O_1, O_2, \ldots, O_n$ are correct, then O is correct. We say O is a derived object of
the implementation $\mathcal{I}$, and $O_1, O_2, \ldots, O_n$ are the base objects of O. All implementations
studied in this paper are wait-free. Hereafter we write “implementation” as shorthand for
“wait-free implementation”.

We define the terms self-implementation of T and resource complexity as in Section
1.3. Our interest lies not just in implementations, but in implementations that tolerate the
failures of base objects. Thus, we also need to define a fault-tolerant implementation. We
present such a definition in Section 4, after defining failure models in Section 3.


3  Failure  models 

An object is only an abstraction with a multitude of possible implementations. For
instance, it may be built as a hardware module in a tightly coupled multi-processor system,
or as a server machine in a message passing distributed system. Whatever the
implementation, the reality is that hardware components sometimes fail, and when this happens, the
implementation fails to provide the intended abstraction.

Object failures lead to undesirable system behavior. Therefore, it is important to
implement derived objects that behave correctly even if some of the base objects of the
implementation fail. The complexity of such a fault-tolerant implementation depends on the
failure model, i.e., the manner in which a failed base object departs from correct behavior.
In this paper, we define a spectrum of failure models that fall into two broad classes:
responsive and non-responsive.

As  we  will  see,  in  most  models  of  failure,  an  object  O  of  type  T  may  fail  by  returning  a 
response  that  is  not  allowed  by  its  type;  that  is,  a  response  not  in  RES(T).  When  a  process 
P  gets  such  a  response  from  O,  it  knows  that  O  is  faulty.  Thus,  it  is  reasonable  to  assume 
that  P  does  not  invoke  operations  on  O  thereafter.  We  restrict  our  attention  to  executions 
in  which  this  assumption  holds. 




3.1  Responsive  models  of  failure 


An object experiencing a responsive failure responds to every invocation, even though the
response may be incorrect. In other words, the object remains wait-free even after it fails.
We describe below three increasingly severe models of responsive failures.

3.1.1  R-crash 

R-crash is the most benign model of object failure. Informally, an object that fails by
R-crash behaves correctly until it fails, and once it fails, it returns a distinguished response
$\bot$ to every invocation. This model is based on the premise that an object detects when it
becomes faulty.

More precisely, an object O fails in execution E by R-crash iff it fails in E, and satisfies
the following properties:

1. O is wait-free in E.

2. Every response from O in E is either $\bot$ or one of the responses allowed by the type
of O. An operation that returns $\bot$ is an aborted operation.

3. Let $\mathcal{H}$ be the history of O in E. Every operation in $\mathcal{H}$ that is preceded by an aborted
operation is itself an aborted operation.

4. Removing the aborted operations from $\mathcal{H}$ results in a linearizable history with respect
to the type of O.

Property 3 is the “once $\bot$, everafter $\bot$” property of R-crash. Property 4 models the
requirement that O should behave correctly until it fails.
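Properties 2 and 3 are simple predicates on a history and can be checked directly. The sketch below assumes a finite, complete history given as tuples (P, op, v, t_s, t_e), a set `allowed` standing for the responses permitted by the type, and a string standing for the distinguished response; these encodings and the function names are illustrative assumptions. Property 1 concerns infinite executions and is not checked, and Property 4 would additionally require a linearizability check on the output of `strip_aborted`.

```python
BOT = "⊥"  # stand-in for the distinguished response (assumed encoding)

def precedes(a, b):
    """a precedes b iff a completes before b is invoked."""
    return a[4] < b[3]

def satisfies_rcrash_2_and_3(history, allowed):
    """Check Properties 2 and 3 of R-crash on a complete history."""
    ops = list(history)
    # Property 2: every response is BOT or allowed by the type.
    if any(v != BOT and v not in allowed for (_, _, v, _, _) in ops):
        return False
    # Property 3: an operation preceded by an aborted one is itself aborted.
    for a in ops:
        if a[2] == BOT and any(precedes(a, b) and b[2] != BOT for b in ops):
            return False
    return True

def strip_aborted(history):
    """The remainder that Property 4 requires to be linearizable."""
    return [o for o in history if o[2] != BOT]
```

A history in which one process receives the distinguished response and a later operation by another process receives an ordinary response fails Property 3, which is the situation that motivates R-omission.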

3.1.2  R-omission 

Consider an implementation $\mathcal{I}$, and a derived object O of $\mathcal{I}$. Even if the base objects of O
can only fail by R-crash, O itself may experience a more severe failure than R-crash. To see
this, suppose a base object b of O fails by R-crash. Consider a process P that invokes an
operation op on O and executes Apply(P, op, O). If Apply(P, op, O) accesses b, b returns $\bot$
to P. This may cause P’s invocation of op on O to terminate and return $\bot$. Now suppose
that another process Q later invokes some operation op' on O, and that Apply(Q, op', O) is
not required to access b. Then, process Q cannot notice the failure of b. So Q’s invocation of
op' on O terminates “normally” and returns a non-$\bot$ response. Thus, O’s behavior violates
the “once $\bot$, everafter $\bot$” property of R-crash. Does this mean that O’s failure is arbitrary?
We now argue that this is not the case.

Recall that after P gets $\bot$, P refrains from accessing O again. To Q, this scenario
is indistinguishable from one in which P had crashed in the middle of the procedure
Apply(P, op, O), while accessing b. Since the implementation $\mathcal{I}$ (from which O is derived)
is wait-free, O tolerates the apparent crash of P. Thus, O’s response to Q must be correct.
So, the failure of O is more severe than R-crash, but is not completely arbitrary. The
R-omission model captures such a failure.⁴

More precisely, an object O fails in execution E by R-omission iff it fails in E, and
satisfies the following properties:

1. O is wait-free in E.

2. Every response from O in E is either $\bot$ or one of the responses allowed by the type
of O.

3. Let $\mathcal{H}$ be the history of O in E. Replacing every aborted operation $(P, op, \bot, t_s, t_e)$
in $\mathcal{H}$ by an incomplete operation $(P, op, *, t_s, \infty)$ results in a linearizable history with
respect to the type of O.
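The history transformation in Property 3 is mechanical. The sketch below shows only that rewriting step, assuming operations are tuples (P, op, v, t_s, t_e) and using illustrative encodings for the distinguished response, the placeholder response, and infinity; the function name is hypothetical. Checking that the rewritten history is linearizable is a separate step.

```python
INF = float("inf")
BOT = "⊥"      # stand-in for the distinguished response (assumed encoding)
PENDING = "*"  # response placeholder of an incomplete operation

def aborted_to_incomplete(history):
    """Replace each aborted (P, op, BOT, t_s, t_e) by (P, op, *, t_s, INF)."""
    return [(p, op, PENDING, ts, INF) if v == BOT else (p, op, v, ts, te)
            for (p, op, v, ts, te) in history]

# P's operation is aborted; Q's later operation gets an ordinary response.
h = [("P", "propose 1", BOT, 0, 1), ("Q", "propose 0", 0, 2, 3)]
rewritten = aborted_to_incomplete(h)
```

After the rewrite, P's operation looks like one that never completed, so it may be either dropped or treated as having taken effect when a linearization is sought.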

3.1.3  R-arbitrary 

An object O fails in execution E by R-arbitrary⁵ iff it fails in E and is wait-free in E. In
other words, O responds to every invocation in E, but the history of O is not linearizable
with respect to the type of O.

3.2  Non-responsive  models  of  failure 

Each responsive model of failure has its non-responsive counterpart. The difference lies in
the fact that an object experiencing a non-responsive failure may also fail to respond to
invocations.

3.2.1  Crash 

Crash is the most benign of all non-responsive models of failure. Informally, an object
subject to a crash failure behaves correctly until it fails (Property 1, below), and once it
fails, it never responds to any invocations (Property 2, below). More precisely, an object O
fails in execution E by crash iff it fails in E, and satisfies the following properties:

1. The history of O in E is linearizable with respect to the type of O.

2. The total number of responses from O in E is finite.

^4 Formal justification for the R-omission model will be apparent in Section 7.

^5 For readability, we sometimes prefer writing "O experiences an R-arbitrary failure in E".

3.2.2 Omission


Omission failures are more severe than crash. An object O fails in execution E by omission
iff it fails in E, and the history of O in E is linearizable with respect to the type of O. In
particular, an object that fails by omission does not necessarily satisfy Property 2 of the crash
model. Thus, an object that fails by omission may not respond to invocations from some
processes, but respond to invocations from others forever.

3.2.3  Arbitrary 

The behavior of an object that experiences an arbitrary failure is completely unrestricted.
In particular, such an object may not respond to an invocation, and even if it does, the response
may be arbitrary. More precisely, an object O fails in execution E by arbitrary iff it fails
in E.


4  Definition  of  fault-tolerant  implementations 

An implementation I of type T is t-tolerant for failure model M if every derived object O
of I has the following property: In every execution, if at most t base objects fail, and they
fail by M, then O is correct.

An implementation I is gracefully degrading for failure model M if every derived object
O of I has the following property: In every execution, if all base objects that fail, fail by
M, then either O is correct or it fails by M.

Let O be a derived object of an implementation which is both t-tolerant and gracefully
degrading for failure model M. The above definitions imply that: (i) if at most t base
objects of O fail and they fail by M, then O does not fail, and (ii) if more than t base
objects of O fail, and they fail by M, then O may fail, but it does not experience a more
severe failure than M. Property (i) is guaranteed by t-tolerance, and property (ii) by
graceful degradation.

Gracefully degrading implementations can be easily composed as shown in the following
lemma. Given a list L of integers and an integer n, let MinSum(n, L) be the sum of the n
smallest integers in L.

Lemma 4.1 If a type T has a t-tolerant gracefully degrading implementation I from the
list $T_1, T_2, \ldots, T_n$ of types for failure model M, and each $T_i$ ($1 \le i \le n$) has a $t_i$-tolerant
gracefully degrading implementation $I_i$ from $T_{i1}, T_{i2}, \ldots, T_{ij_i}$ for M, then T has a $t'$-tolerant
gracefully degrading implementation $I'$ from $T_{11}, T_{12}, \ldots, T_{1j_1}, T_{21}, \ldots, T_{2j_2}, \ldots, T_{n1}, \ldots, T_{nj_n}$
for M. In the above, $t' = \mathrm{MinSum}(t + 1, (t_1 + 1, t_2 + 1, \ldots, t_n + 1)) - 1$.

Proof (sketch) Define $I'(o_{11}, \ldots, o_{1j_1}, \ldots, o_{n1}, \ldots, o_{nj_n}) = I(O_1, \ldots, O_n)$ where $O_1 =
I_1(o_{11}, o_{12}, \ldots, o_{1j_1}), \ldots, O_n = I_n(o_{n1}, o_{n2}, \ldots, o_{nj_n})$. Assume that each $o_{kl}$, if it fails, only
fails by M. Since $I_i$ is $t_i$-tolerant, $O_i$ fails only if at least $t_i + 1$ objects among $o_{i1}, \ldots, o_{ij_i}$
fail; furthermore, since $I_i$ is gracefully degrading, $O_i$ fails only by M. Similarly, since I is
t-tolerant, $I(O_1, \ldots, O_n)$ fails only if at least t + 1 objects among $O_1, \ldots, O_n$ fail. Thus,
for $I(O_1, \ldots, O_n)$ to fail, at least $\mathrm{MinSum}(t + 1, (t_1 + 1, t_2 + 1, \ldots, t_n + 1))$ objects among
$o_{11}, \ldots, o_{1j_1}, \ldots, o_{n1}, \ldots, o_{nj_n}$ must fail. In other words, $I'$ is a $t'$-tolerant implementation
of T from $T_{11}, \ldots, T_{nj_n}$. $I'$ is gracefully degrading for M because I and each $I_i$ ($1 \le i \le n$)
are gracefully degrading for M. □
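
The arithmetic behind Lemma 4.1 can be checked with a minimal Python sketch. The function names `min_sum` and `composed_tolerance` are illustrative and do not appear in the paper; the sketch only evaluates the formula for t', it does not model objects or failures.

```python
def min_sum(n, values):
    """Sum of the n smallest integers in values (the paper's MinSum(n, L))."""
    if n > len(values):
        raise ValueError("n exceeds the length of the list")
    return sum(sorted(values)[:n])


def composed_tolerance(t, inner_tolerances):
    """t' = MinSum(t + 1, (t_1 + 1, ..., t_n + 1)) - 1, as in Lemma 4.1."""
    return min_sum(t + 1, [t_i + 1 for t_i in inner_tolerances]) - 1


# Outer implementation: 1-tolerant over three derived objects,
# each of which is itself built by a 1-tolerant implementation.
print(composed_tolerance(1, [1, 1, 1]))     # MinSum(2, (2, 2, 2)) - 1 = 3

# Unequal inner tolerances: the adversary targets the cheapest ones first.
print(composed_tolerance(2, [0, 1, 3, 5]))  # MinSum(3, (1, 2, 4, 6)) - 1 = 6
```

The first example has all inner tolerances equal to the outer one, which gives (t + 1)^2 - 1 = t^2 + 2t for t = 1.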

The above lemma can be used to enhance the fault-tolerance of a self-implementation.
This is the substance of the next corollary, obtained by setting $T_i = T$, $t_i = t$, $j_i = n$, and
$I_i = I$ in the lemma.

Corollary 4.1 If a type T has a t-tolerant gracefully degrading self-implementation I of
resource complexity n for a failure model M, then T has a $(t^2 + 2t)$-tolerant gracefully
degrading self-implementation $I'$ of resource complexity $n^2$ for M.

Recursive  application  of  the  above  corollary  boosts  the  fault-tolerance  of  self-implementations. 

Corollary 4.2 (Booster Lemma) If a type T has a 1-tolerant gracefully degrading
self-implementation of resource complexity k for a failure model M, then T has a t-tolerant
gracefully degrading self-implementation of resource complexity $O(t^{\log k})$ for M.

In Section 5.1.4, we illustrate how this corollary can be applied to construct a t-tolerant
self-implementation of consensus for R-arbitrary failures.
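
The effect of applying Corollary 4.1 repeatedly can be tabulated with a short Python sketch. It assumes a 1-tolerant gracefully degrading self-implementation of resource complexity k as the starting point; the helper names `boost` and `boost_until`, and the value k = 6 used below, are illustrative choices only.

```python
def boost(t, n):
    """One application of Corollary 4.1: (t, n) -> (t^2 + 2t, n^2)."""
    return t * t + 2 * t, n * n


def boost_until(target_t, k):
    """Apply Corollary 4.1 repeatedly, starting from tolerance 1 and
    resource complexity k, until the tolerance reaches target_t.
    Returns (tolerance, resource complexity, number of rounds)."""
    t, n, rounds = 1, k, 0
    while t < target_t:
        t, n = boost(t, n)
        rounds += 1
    return t, n, rounds


for target in (1, 3, 10):
    print(target, boost_until(target, 6))
```

After r rounds the tolerance satisfies t + 1 = 2^(2^r) and the resource complexity is k^(2^r), since each round squares both t + 1 and n.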


5  Tolerating  responsive  failures 

Herlihy [Her91] and Plotkin [Plo89] showed that one can implement a (wait-free) object of
any type using only consensus and register objects. Therefore, if consensus and register
have t-tolerant implementations, then every object type has a t-tolerant implementation.
Hence we focus on fault-tolerant implementations of consensus and register.


5.1  Fault-tolerant  implementation  of  consensus 

In the following, we first define the object type N-consensus. We then present a t-tolerant
self-implementation of N-consensus that works for both R-crash and R-omission failures.
This implementation requires t + 1 base N-consensus objects, and is thus resource optimal.
Following that, we show how to translate R-arbitrary failures of N-consensus objects
to R-omission failures. Our translation is also proved to be resource optimal. Although
the above two results can be chained together to obtain a t-tolerant self-implementation of
N-consensus for R-arbitrary failures, the resultant self-implementation is not resource
efficient: it requires $O(t^2)$ base consensus objects. We therefore present an alternative efficient
self-implementation of resource complexity $O(t \log t)$.

5.1.1 The object type N-consensus

N-consensus is an N-process object type that supports two operations, propose 0 and
propose 1, and has the following sequential specification: If the first operation invoked
is propose v, then every invocation (including the first) is returned the response v. The
following two propositions follow directly from definitions:

Proposition 5.1 An N-consensus object O is correct in execution E if and only if it is
wait-free and satisfies the following three properties in E:

• Validity: If O returns a response v, and v ∈ {0, 1}, then there was a prior invocation
of propose v on O.

• Agreement: If O returns $v_1, v_2$ to two invocations, and $v_1, v_2 \in \{0, 1\}$, then $v_1 = v_2$.

• Integrity: Every response of O is either 0 or 1.

An N-consensus object O satisfies weak integrity in an execution E iff every response of
O in E is either 0, 1, or ⊥.

Proposition  5.2  Let  O  be  an  N-consensus  object  that  fails  in  execution  E.  Object  O  fails 
by  R-omission  in  E  if  and  only  if  it  is  wait-free,  and  satisfies  validity,  agreement,  and  weak 
integrity  in  E. 
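
Propositions 5.1 and 5.2 can be read as a classification procedure for the responses of a consensus object. The Python sketch below makes this concrete under simplifying assumptions that are not part of the paper: operations do not overlap, the history is a list of (proposed value, response) pairs in invocation order, every invocation has a response (so wait-freedom is taken for granted), and `None` stands in for ⊥. The names `check` and `classify` are hypothetical.

```python
BOTTOM = None  # stands in for the response ⊥


def check(history):
    """Evaluate validity, agreement, integrity and weak integrity on a
    sequential history of (proposed_value, response) pairs."""
    proposed, binary = set(), set()
    validity = True
    for value, response in history:
        proposed.add(value)
        if response in (0, 1):
            binary.add(response)
            # Validity: a binary response must have been proposed already.
            validity = validity and response in proposed
    return {
        "validity": validity,
        "agreement": len(binary) <= 1,
        "integrity": all(r in (0, 1) for _, r in history),
        "weak_integrity": all(r in (0, 1) or r is BOTTOM for _, r in history),
    }


def classify(history):
    c = check(history)
    if c["validity"] and c["agreement"] and c["integrity"]:
        return "correct"                      # Proposition 5.1
    if c["validity"] and c["agreement"] and c["weak_integrity"]:
        return "fails by R-omission"          # Proposition 5.2
    return "fails, but not by R-omission"


print(classify([(0, 0), (1, 0)]))        # both processes get the first value
print(classify([(0, 0), (1, BOTTOM)]))   # second operation is aborted
print(classify([(0, 0), (1, 1)]))        # agreement is violated
```

Note that the middle label also covers R-crash behaviour, since an R-crash failure satisfies the R-omission properties as well.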

In describing our implementations, we write loc := Propose(p, v, O)^6 to denote that
process p invokes propose v on O and stores the response in its local variable loc.


5.1.2  Tolerating  R-crash  and  R-omission  failures 

We present a t-tolerant self-implementation of N-consensus for R-omission failures. The
resource complexity is t + 1, and is therefore optimal. Since R-omission failures are strictly
more severe than R-crash, this self-implementation also works for R-crash. However, it is
not gracefully degrading either for R-crash or for R-omission. In fact, we will see in Section
7 that N-consensus has no t-tolerant gracefully degrading implementation for R-crash. For
R-omission, however, we present a t-tolerant gracefully degrading self-implementation of
resource complexity 2t + 1. We also prove that 2t + 1 is a lower bound on the resource
complexity. In fact, this lower bound applies to every "non-trivial" deterministic object
type, not just to N-consensus; furthermore, it is not restricted to self-implementations.

Theorem 5.1 Figure 1 gives a t-tolerant self-implementation of N-consensus for R-omission
failures. The resource complexity of the implementation is t + 1 and is optimal.

^6 Throughout this paper, we write Propose (with upper case "P") if the operation is on a derived object,
and propose (with lower case "p") if it is on a base object.

O_1, O_2, ..., O_{t+1} : N-consensus objects

Procedure Propose(p, v_p, O)    /* v_p ∈ {0,1} */
    estimate_p, w, k : integer local to p
begin
    estimate_p := v_p
    for k := 1 to t + 1 do
        w := propose(p, estimate_p, O_k)
        if w ≠ ⊥ then estimate_p := w
    return(estimate_p)
end


Figure 1: t-tolerant self-implementation of N-consensus for R-omission
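
The procedure of Figure 1 can be exercised with a small sequential Python sketch. The class `BaseConsensus` is a hypothetical model of a base object, not something defined in the paper: processes named in `omit_for` always receive ⊥ (represented by `None`) and their operations take no effect, which is one behaviour permitted by R-omission. Concurrency is not modelled.

```python
BOTTOM = None  # stands in for the response ⊥


class BaseConsensus:
    """Sequential model of a base consensus object (illustrative only)."""

    def __init__(self, omit_for=()):
        self.decided = None
        self.omit_for = set(omit_for)

    def propose(self, p, v):
        if p in self.omit_for:
            return BOTTOM          # aborted operation, treated as never applied
        if self.decided is None:
            self.decided = v       # the first applied proposal wins
        return self.decided


def propose_fig1(p, v, base_objects):
    """Propose(p, v, O) of Figure 1 over base objects O_1, ..., O_{t+1}."""
    estimate = v
    for obj in base_objects:
        w = obj.propose(p, estimate)
        if w is not BOTTOM:
            estimate = w
    return estimate


# t = 2: two base objects fail by R-omission, the third is correct.
objects = [
    BaseConsensus(omit_for={"q"}),
    BaseConsensus(omit_for={"p"}),
    BaseConsensus(),
]
result_p = propose_fig1("p", 0, objects)
result_q = propose_fig1("q", 1, objects)
print(result_p, result_q)
```

In this run q adopts the value 1 at the second object, but the correct third object returns p's value 0 to both processes, so the two results coincide.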


Proof Let O be a derived N-consensus object of the implementation, and $O_1, O_2, \ldots, O_{t+1}$
be its base objects. Consider an execution E in which at most t base objects fail by
R-omission, and the remaining objects are correct. We show that O is correct in E.

1. O satisfies validity: An easy induction on k shows that if $estimate_p$ equals some value
v at any point in E, then there was a prior invocation (from some process q) of
Propose(q, v, O). The induction will use Proposition 5.2, and the fact that p does
not change $estimate_p$ if a base object returns ⊥.

2. O satisfies agreement: Since at most t base objects fail, there is an $O_k$ ($1 \le k \le t + 1$)
that is correct. So $O_k$ returns the same response w ∈ {0, 1} to every process that
accesses it. This implies that for all p that access $O_k$, $estimate_p = w$ when p completes
the kth iteration of the loop. Since each base object in $O_{k+1}, \ldots, O_{t+1}$ is either correct
or fails by R-omission in E, by Propositions 5.1 and 5.2, each of these base objects
satisfies validity. From these facts, it is easy to conclude from the implementation that
$estimate_p$ never changes value from the (k + 1)st iteration onwards. Thus O returns
the same response w to every p.

3. O satisfies integrity: Obvious.

Since a base object that fails by R-omission remains wait-free, it is clear that O is wait-free
in E. By Proposition 5.1, O is correct in E. It is obvious that the resource complexity of
t + 1 of our self-implementation is optimal. □

The above (self) implementation is not gracefully degrading. For instance, suppose that
$v_p = 0$ and $v_q = 1$, and all the t + 1 base objects fail by R-crash initially. It is easy
to see that O returns 0 to p and 1 to q. Thus O does not satisfy agreement, and by
Proposition 5.2, the failure of O is more severe than R-omission. In fact, we will now show
that 2t + 1 is both a lower and upper bound on the resource complexity of a t-tolerant
gracefully degrading self-implementation of N-consensus for R-omission^7. The gracefully
degrading self-implementation that requires 2t + 1 base objects is given in Figure 2.


O_1, O_2, ..., O_{2t+1} : N-consensus objects

Procedure Propose(p, v_p, O)    /* v_p ∈ {0,1} */
    V_p[1..2t+1], estimate_p, w, k : integer local to p
begin
1   estimate_p := v_p
2   for k := 1 to 2t + 1 do
3       w := propose(p, estimate_p, O_k)
4       V_p[k] := w
5       if (w ≠ ⊥) ∧ (w ≠ estimate_p) then
6           estimate_p := w
7           V_p[1..(k-1)] := (⊥, ⊥, ..., ⊥)
8   if V_p has more than t ⊥'s then
9       return(⊥)
10  else return(estimate_p)
end


Figure 2: t-tolerant gracefully degrading self-implementation of N-consensus for R-omission
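
A sequential Python sketch of Figure 2 follows. As before, `BaseConsensus` is a hypothetical stand-in for a base object in which the processes listed in `omit_for` always receive ⊥ (`None`) without affecting the object; this is only one of the behaviours R-omission allows, and concurrency is not modelled.

```python
BOTTOM = None  # stands in for the response ⊥


class BaseConsensus:
    """Sequential model of a base consensus object (illustrative only)."""

    def __init__(self, omit_for=()):
        self.decided = None
        self.omit_for = set(omit_for)

    def propose(self, p, v):
        if p in self.omit_for:
            return BOTTOM
        if self.decided is None:
            self.decided = v
        return self.decided


def propose_fig2(p, v, base_objects, t):
    """Propose(p, v, O) of Figure 2 over base objects O_1, ..., O_{2t+1}."""
    assert len(base_objects) == 2 * t + 1
    estimate = v                                 # line 1
    V = []
    for k, obj in enumerate(base_objects):       # line 2 (k is 0-based here)
        w = obj.propose(p, estimate)             # line 3
        V.append(w)                              # line 4
        if w is not BOTTOM and w != estimate:    # line 5
            estimate = w                         # line 6
            V[:k] = [BOTTOM] * k                 # line 7: forget earlier entries
    if V.count(BOTTOM) > t:                      # line 8
        return BOTTOM                            # line 9
    return estimate                              # line 10


t = 1
# At most t failures: the derived object stays correct.
few = [BaseConsensus(omit_for={"q"}), BaseConsensus(), BaseConsensus()]
print(propose_fig2("p", 0, few, t), propose_fig2("q", 1, few, t))

# More than t failures: q may get ⊥, but never a conflicting binary value.
many = [BaseConsensus(omit_for={"q"}) for _ in range(2 * t + 1)]
print(propose_fig2("p", 0, many, t), propose_fig2("q", 1, many, t))
```

The second scenario shows the intended degradation: with every base object omitting q's operations, q collects more than t ⊥'s and returns ⊥ rather than its own proposal.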


Claim 5.1 For every k, $1 \le k \le 2t + 1$, at the end of the kth iteration of the for-loop
of Propose(p, $v_p$, O) in Figure 2, $estimate_p \in \{0, 1\}$, and $V_p[1..k]$ contains only ⊥'s and
$estimate_p$'s.

Proof By an easy induction on k. □

Theorem 5.2 Figure 2 gives a t-tolerant gracefully degrading self-implementation of N-consensus
for R-omission.

Proof Let O be a derived N-consensus object of the implementation, and $O_1, O_2, \ldots, O_{2t+1}$
be its base objects. Consider an execution E in which all base objects that fail, fail by
R-omission.

1. O is wait-free: Obvious since base objects that fail by R-omission remain wait-free.

2. O satisfies validity: An easy induction on k shows that if $estimate_p$ equals some value
v at any point in E, then there was a prior invocation (from some process q) of
Propose(q, v, O). The induction will use Proposition 5.2, and the fact that p does
not change $estimate_p$ if a base object returns ⊥.

^7 As will be shown later in Theorem 7.2, there is no t-tolerant gracefully degrading implementation of
N-consensus for R-crash.

3. O satisfies agreement: Suppose, for a contradiction, there exist two processes p and
q such that Propose(p, $v_p$, O) returns 0 and Propose(q, $v_q$, O) returns 1. From Claim
5.1, and lines 8, 9 of the algorithm, it follows that $V_p$ has at least t + 1 0's at the end
of the execution of Propose(p, $v_p$, O) and $V_q$ has at least t + 1 1's at the end of the
execution of Propose(q, $v_q$, O). This is possible only if there is a k ($1 \le k \le 2t + 1$) such
that propose(p, $estimate_p$, $O_k$) returned 0 and propose(q, $estimate_q$, $O_k$) returned 1.
Thus $O_k$ does not satisfy agreement. By Proposition 5.2, the failure of $O_k$ in E is not
by R-omission, a contradiction.

4.  O  satisfies  weak  integrity:  Obvious. 

5. O satisfies integrity if at most t base objects fail: Let $O_{k_1}, O_{k_2}, \ldots, O_{k_l}$ ($k_1 < k_2 <
\ldots < k_l$) be all the correct base objects. Since at most t fail, we have $l \ge t + 1$. By
Proposition 5.1, $O_{k_1}$ satisfies integrity and agreement. Thus, there is a v ∈ {0, 1} such
that for all p, propose(p, $estimate_p$, $O_{k_1}$) returns v. Thus, for all p, $estimate_p = v$ at
the end of $k_1$ iterations of the for-loop in Propose(p, $v_p$, O). Using this and Proposition
5.2, it is easy to verify that at the end of the execution of Propose(p, $v_p$, O), $V_p[k_i] = v$
and $estimate_p = v$ for all p and for all $1 \le i \le l$. This implies, by lines 8, 9 of the
algorithm, that Propose(p, $v_p$, O) returns v.

From 1, 2, 3, and 4 above, and Proposition 5.2, we conclude that either O is correct
in E, or O fails by R-omission in E. From 1, 2, 3, and 5 above, and Proposition 5.1, we
conclude that if at most t base objects of O fail in E, O is correct in E. Thus, Figure 2 is
a t-tolerant gracefully degrading self-implementation of N-consensus for R-omission. □

We now prove a general lower bound on the resource complexity of gracefully degrading
implementations for R-omission. Informally, a type T is trivial if it admits the following
implementation: there is a function f such that every Apply(P, op, O) blindly returns f(op).
More precisely, T is trivial if there is a function $f : OP(T) \to RES(T)$ such that for every
sequence $op_1, op_2, \ldots, op_k$ of operations, $(op_1, f(op_1)), (op_2, f(op_2)), \ldots, (op_k, f(op_k))$ is
consistent with respect to T. An object type is non-trivial if it is not trivial. The following
proposition is immediate from the definitions.

Proposition 5.3 Let T be a deterministic non-trivial object type, and $f_0 : OP(T) \to
RES(T)$ be the function such that for all op, $(op, f_0(op))$ is consistent with respect to T.^8
Then there exists a $k \ge 1$ and a sequence $op_1, op_2, \ldots, op_k, op_{k+1}$ of operations such that
$(op_1, f_0(op_1)), (op_2, f_0(op_2)), \ldots, (op_k, f_0(op_k))$ is consistent with respect to T, but $(op_1, f_0(op_1)),
(op_2, f_0(op_2)), \ldots, (op_k, f_0(op_k)), (op_{k+1}, f_0(op_{k+1}))$ is not.

Theorem 5.3 Let T be any deterministic non-trivial object type. The resource complexity
of any t-tolerant gracefully degrading implementation of T for R-omission is at least 2t + 1.

Proof Suppose T has a t-tolerant gracefully degrading implementation I from some list
$T_1, T_2, \ldots, T_{2t}$ of object types for R-omission. Let $O_1, O_2, \ldots, O_{2t}$ be base objects of type
$T_1, T_2, \ldots, T_{2t}$, and let $O = I(O_1, O_2, \ldots, O_{2t})$ be the corresponding derived object (of type
T). Let $f_0$ and $op_1, op_2, \ldots, op_k, op_{k+1}$ be as in Proposition 5.3. Consider the following
scenario in which two processes P and Q access the object O. At the start of the scenario,
object O is in the initial state, and all its base objects fail, as described below.

^8 Note that $f_0(op)$ is the response of an object of type T when op is the first operation applied to that
object.

For objects $O_i$, $1 \le i \le t$: Whenever P invokes an operation on $O_i$, it returns a correct
response to P and undergoes an appropriate change of state; but whenever Q invokes an
operation on $O_i$, it returns ⊥ and does not undergo any change of state. For objects $O_j$,
$t + 1 \le j \le 2t$: Whenever P invokes an operation on $O_j$, it returns ⊥ and does not undergo
any change of state; but whenever Q invokes an operation on $O_j$, it returns a correct
response to Q and undergoes an appropriate change of state.

Scenario S

1. Process Q executes the sequence $op_1, op_2, \ldots, op_k$ of operations on O. Let $v_1, v_2, \ldots, v_k$
be the corresponding responses.

2. Process P executes $op_{k+1}$ on O.

(All steps in Item 1 strictly precede every step in Item 2.) Note that:

1.  The  failure  of  each  base  object  is  by  R-omission. 

2. The scenario S is indistinguishable to Q from a scenario S' in which $O_1, O_2, \ldots, O_t$
fail as above, but $O_{t+1}, O_{t+2}, \ldots, O_{2t}$ are correct. Since O is derived from a t-tolerant
implementation, the responses to $op_1, op_2, \ldots, op_k$ returned by Q in S' must be correct.
So the responses in S' must be $f_0(op_1), f_0(op_2), \ldots, f_0(op_k)$, respectively. Since S and
S' are indistinguishable to Q, Q returns the same responses in S.

3. When P executes $op_{k+1}$ on O, the manner in which objects have failed makes it impossible
for P to know whether Q previously executed any operations on O. So, the scenario
S is indistinguishable to P from a scenario S'' in which (i) it is the first process to
invoke an operation on O, and (ii) only t base objects, namely $O_{t+1}, O_{t+2}, \ldots, O_{2t}$,
fail. Since O is derived from a t-tolerant implementation, P must return the correct
response in S''. So P must return $f_0(op_{k+1})$ in S''. Since S is indistinguishable to P
from S'', P also returns the response $f_0(op_{k+1})$ in S.

By Proposition 5.3, $(op_1, f_0(op_1)), (op_2, f_0(op_2)), \ldots, (op_k, f_0(op_k)), (op_{k+1}, f_0(op_{k+1}))$ is
not consistent with respect to T. So, the history of object O in the above scenario is not
linearizable with respect to its type T. Thus, O does not satisfy Property 3 of R-omission
in Section 3.1.2. In other words, the failure of O is not by R-omission, even though the
base objects of O have only failed by R-omission. This implies that I, the implementation
from which O is derived, is not gracefully degrading for R-omission. □

5.1.3 Translation from R-arbitrary to R-omission

A self-implementation I of object type T is a t-tolerant translation from a failure model M
to a failure model M' for T if every derived object O of I satisfies the following property:
In every execution E, if at most t base objects of O fail, and fail by M, then either O is
correct or it fails by M'. Note that if no base objects fail in E, then O does not fail either
(this follows from the definition of implementation).

In this section, we present a t-tolerant translation from R-arbitrary to R-omission for
N-consensus. We also show that its resource complexity, 3t + 1, is optimal. This translation
can be used along with the t-tolerant self-implementation of N-consensus for R-omission
(seen in Section 5.1.2) to obtain a t-tolerant self-implementation of N-consensus for
R-arbitrary failures.

Since a consensus object that experiences an R-arbitrary failure may return a non-binary
response, we always "filter" the responses to get a binary response: procedure
f-propose(p, v, O) returns propose(p, v, O) if it is 0 or 1, and returns 0 otherwise.


A[1 ... 2t+1], B[1 ... t] : N-consensus objects

Procedure Propose(p, v_p, O)
    count_p[0..1], w, i, belief_p : integer local to p
begin
1   Phase 1: count_p[0..1] := (0, 0)
2       for i := 1 to 2t + 1 do
3           w := f-propose(p, v_p, A[i])
4           count_p[w] := count_p[w] + 1
5   Phase 2: Choose belief_p such that
                count_p[belief_p] > count_p[1 - belief_p]
6       for i := 1 to t do
7           if belief_p ≠ f-propose(p, belief_p, B[i]) then
8               return(⊥)
9       return(belief_p)
end


Figure 3: t-tolerant translation from R-arbitrary to R-omission for N-consensus
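
The translation of Figure 3, together with the f-propose filter, can be sketched sequentially in Python. The classes `Correct` and `Arbitrary` are hypothetical models introduced for illustration: the former follows the sequential specification of consensus, the latter responds to every invocation (so it stays wait-free) with a scripted, possibly meaningless value. `None` stands in for ⊥, and concurrency is not modelled.

```python
BOTTOM = None  # stands in for the response ⊥


class Correct:
    """Base consensus object that follows its sequential specification."""

    def __init__(self):
        self.decided = None

    def propose(self, p, v):
        if self.decided is None:
            self.decided = v
        return self.decided


class Arbitrary:
    """Base object with a scripted R-arbitrary behaviour (illustrative)."""

    def __init__(self, script):
        self.script = script          # maps a process name to its response

    def propose(self, p, v):
        return self.script[p]


def f_propose(obj, p, v):
    """Filter: pass through a binary response, map anything else to 0."""
    w = obj.propose(p, v)
    return w if w in (0, 1) else 0


def propose_fig3(p, v, A, B, t):
    """Propose(p, v, O) of Figure 3 with base objects A[1..2t+1], B[1..t]."""
    assert len(A) == 2 * t + 1 and len(B) == t
    count = [0, 0]                                # Phase 1
    for a in A:
        count[f_propose(a, p, v)] += 1
    belief = 0 if count[0] > count[1] else 1      # strict majority, 2t+1 is odd
    for b in B:                                   # Phase 2
        if f_propose(b, p, belief) != belief:
            return BOTTOM
    return belief


t = 1
# One faulty object in A: its junk answers are outvoted.
A1 = [Correct(), Correct(), Arbitrary({"p": "junk", "q": 1})]
B1 = [Correct()]
print(propose_fig3("p", 0, A1, B1, t), propose_fig3("q", 1, A1, B1, t))

# One faulty object in B: q's belief is contradicted, so q returns ⊥.
A2 = [Correct(), Correct(), Correct()]
B2 = [Arbitrary({"p": 0, "q": 1})]
print(propose_fig3("p", 0, A2, B2, t), propose_fig3("q", 1, A2, B2, t))
```

In the second scenario the derived object returns ⊥ to q but never a binary value that conflicts with p's, which is the R-omission behaviour the translation is meant to guarantee when at most t base objects fail.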


Let O be an N-consensus object derived from the translation in Figure 3. The base
objects of O are A[1 ... 2t+1], B[1 ... t].

Claim 5.2 O satisfies integrity in any execution in which all base objects of O are correct.

Proof Clear from the algorithm. □

Claim 5.3 O is wait-free in any execution in which all base objects of O are wait-free.

Proof Clear from the algorithm. □

In the following claims, let E be an execution in which at most t base objects experience
R-arbitrary failures, and the remaining are correct.

Claim  5.4  O  satisfies  weak  integrity  in  E. 

Proof  Clear  from  the  algorithm.  □ 

Claim  5.5  O  satisfies  validity  in  E. 

Proof Suppose O returns v ∈ {0, 1} to the invocation Propose(p, $v_p$, O) (from process p).
Then $v = belief_p$ (by line 9), and $count_p[v] = count_p[belief_p] \ge t + 1$ (by line 5). So there is at
least one correct base object A[i] such that propose(p, $v_p$, A[i]) returned v. By Proposition
5.1, A[i] satisfies validity. It follows that some process q invoked propose(q, $v_q$, A[i]) where
$v_q = v$. This implies that q invoked Propose(q, v, O). □

Claim  5.6  O  satisfies  agreement  in  E. 

Proof Suppose O fails to satisfy agreement by returning v ∈ {0, 1} to some process p, and
$\bar{v}$ (the complement of v) to a different process q. O returns v to p implies $v = belief_p$. Similarly $\bar{v} = belief_q$. We
thus have $belief_p \ne belief_q$. It is easy to verify that if all of A[1 ... 2t+1] are correct, then
$belief_p = belief_q$. It follows that at least one of A[1 ... 2t+1] fails.

Further, O returns v to p implies, for all $1 \le i \le t$, propose(p, $belief_p$, B[i]) returns
$belief_p = v$ to p. Similarly, for all $1 \le i \le t$, propose(q, $belief_q$, B[i]) returns $belief_q = \bar{v}$
to q. Thus all t base objects B[1 ... t] fail by not satisfying agreement. Counting the failed
A[i]'s and B[i]'s, we have more than t failed base objects, a contradiction. □

From the above claims, and Propositions 5.1 and 5.2, we conclude that: (i) O is correct
in every execution in which all base objects of O are correct; and (ii) O is either correct
or it fails by R-omission in every execution in which at most t base objects of O fail by
R-arbitrary, and the remaining base objects are correct. Thus,

Theorem  5.4  Figure  3  presents  a  t-tolerant  translation  from  R-arbitrary  failures  to  R- 
omission  failures  for  N-consensus.  The  resource  complexity  of  the  translation  is  3t  +  1. 


Theorem 5.5 The resource complexity of any translation I from R-arbitrary to R-omission
for N-consensus is at least 3t + 1.

Proof For a contradiction, assume the resource complexity of I is $n \le 3t$. We prove
the theorem through a series of claims, involving "indistinguishable" scenarios. Let $O =
I(o_1, o_2, \ldots, o_n)$. In the following, we say a process p accesses a base object $o_j$ if during the
execution of Propose(p, $v_p$, O), p executes propose(p, *, $o_j$).

Claim 5.7 Suppose p executes Propose(p, 0, O) to completion. If all base objects are correct,
then p accesses at least t + 1 base objects.

Proof Suppose the claim is false, and p accesses only $o_{i_1}, o_{i_2}, \ldots, o_{i_m}$ ($m \le t$) before
completing Propose(p, 0, O). Since all base objects are correct, O satisfies validity and
integrity. Hence Propose(p, 0, O) returns 0. Now consider the following two scenarios.

Scenario S1

1. p executes Propose(p, 0, O) to completion accessing only $o_{i_1}, o_{i_2}, \ldots, o_{i_m}$ ($m \le t$).
Propose(p, 0, O) returns 0.

2. q executes Propose(q, 1, O) to completion.

Scenario S2

1. $o_{i_1}, o_{i_2}, \ldots, o_{i_m}$ fail and behave as though they are accessed by p exactly as in scenario
S1. This is possible since $m \le t$.

2. q executes Propose(q, 1, O) to completion.

Since no base objects fail in S1, O must be correct in S1. By Proposition 5.1, O satisfies
integrity and agreement. Thus Propose(q, 1, O) returns 0 in S1. Clearly S1 $\approx_q$ S2 (we
write S1 $\approx_q$ S2 to denote that Scenarios S1 and S2 are indistinguishable to process q). So
Propose(q, 1, O) returns 0 in S2 also, violating validity. By Propositions 5.1 and 5.2, O is
neither correct nor does it fail by R-omission. Since at most t base objects fail in S2, and
they fail by R-arbitrary, the translation I is incorrect, a contradiction. □

Claim 5.8 Consider
Scenario S3

1. p executes Propose(p, 0, O) up to the point where it has accessed exactly t base objects
$o_{i_1}, o_{i_2}, \ldots, o_{i_t}$.

2. q executes Propose(q, 1, O) to completion.

Then Propose(q, 1, O) returns 1.

Proof Let S = {base objects accessed by q} − $\{o_{i_1}, o_{i_2}, \ldots, o_{i_t}\}$. Let $o_{j_1}, o_{j_2}, \ldots, o_{j_k}$ be all
the base objects in S arranged in order of first invocation of q. Note that $k \le n - t \le 2t$.

Let S2' represent scenario S2 when m = t. Since at most t base objects fail in S2',
and they fail by R-arbitrary, O must either be correct or fail by R-omission. Hence, by
Propositions 5.1 and 5.2, O satisfies validity and weak integrity in S2'. So Propose(q, 1, O)
returns 1 or ⊥ in S2'. Since S2' $\approx_q$ S3, we conclude Propose(q, 1, O) returns 1 or ⊥ in S3.
Since no base object fails in S3, O must be correct. By Proposition 5.1, O satisfies integrity
in S3. So Propose(q, 1, O) returns either 0 or 1 in S3. Together with the above conclusion,
this implies the claim. □

Claim 5.9 Consider
Scenario S4

1. p executes Propose(p, 0, O) up to the point where it has accessed exactly t base objects
$o_{i_1}, o_{i_2}, \ldots, o_{i_t}$.

2. Let $o_{j_1}, o_{j_2}, \ldots, o_{j_k}$ be as defined above (note $k \le 2t$). q executes Propose(q, 1, O) up
to the point where it has accessed exactly $\{o_{j_1}, o_{j_2}, \ldots, o_{j_{k-t}}\}$.

3. p completes the execution of Propose(p, 0, O).

Then Propose(p, 0, O) returns 0.

Proof Consider
Scenario S5

1. p executes Propose(p, 0, O) up to the point where it has accessed exactly t base objects
$o_{i_1}, o_{i_2}, \ldots, o_{i_t}$.

2. The base objects $o_{j_1}, o_{j_2}, \ldots, o_{j_{k-t}}$ fail and behave as though they are accessed by q
exactly as in S4.

3. p completes the execution of Propose(p, 0, O).

Since $k \le 2t$, the number of base objects that fail in S5 is $k - t \le t$. Since they fail
by R-arbitrary in S5, either O is correct in S5, or O fails by R-omission in S5. Thus, by
Propositions 5.1 and 5.2, O satisfies validity and weak integrity in S5. So Propose(p, 0, O)
returns either 0 or ⊥ in S5. Since clearly S4 $\approx_p$ S5, Propose(p, 0, O) returns either 0 or ⊥
in S4 also. However, since no base object fails in S4, O is correct in S4, and by Proposition
5.1, it satisfies integrity in S4. Thus Propose(p, 0, O) returns 0 in S4. □

Claim 5.10 Consider Scenario S6:

1. p executes Propose(p, 0, O) up to the point where it has accessed exactly t base objects o_{i_1}, o_{i_2}, …, o_{i_t}.

2. q executes Propose(q, 1, O) to completion, returning 1, by Claim 5.8.

3. Let o_{j_1}, o_{j_2}, …, o_{j_k} be as defined above (note k < 2t). The base objects { … } fail and behave as though they are never accessed by q.

4. p completes the execution of Propose(p, 0, O).

Then Propose(p, 0, O) returns 0.

Proof Note that S4 and S6 are indistinguishable to p. By Claim 5.9, Propose(p, 0, O) returns 0 in S4. So Propose(p, 0, O) returns 0 in S6. □

From the above claim, it is clear that O does not satisfy agreement in S6: Propose(q, 1, O) returns 1, while Propose(p, 0, O) returns 0. Hence, by Propositions 5.1 and 5.2, O fails in S6, but not by R-omission. Since at most t base objects fail in S6, and they fail by R-arbitrary, the translation I is incorrect, a contradiction. This completes the proof of Theorem 5.5. □

5.1.4 Tolerating R-arbitrary failures

Since N-consensus has a t-tolerant translation from R-arbitrary to R-omission (of resource complexity 3t + 1), and has a t-tolerant self-implementation for R-omission failures (of resource complexity t + 1), it follows that N-consensus has a t-tolerant self-implementation for R-arbitrary failures. However, the resulting self-implementation is expensive, requiring (3t + 1)(t + 1) base objects. In this section, we present a t-tolerant self-implementation for R-arbitrary failures whose resource complexity is only O(t log t).† This self-implementation uses the divide-and-conquer strategy. In Figure 4, we present the base step: obtaining a 1-tolerant self-implementation of resource complexity 6. In Figure 6, we show the recursive step of obtaining a t-tolerant self-implementation from a t/2-tolerant self-implementation.
Consider the 1-tolerant self-implementation of N-consensus given in Figure 4:

Claim 5.11 Let i be either 1 or 4. If at most one object among O_i, O_{i+1}, and O_{i+2} fails, then Majority(p, O_i, O_{i+1}, O_{i+2}, v) returns v only if there is a concurrent or preceding execution of Majority(q, O_i, O_{i+1}, O_{i+2}, v).

Proof Clear from the algorithm. □

Claim 5.12 Let i be either 1 or 4. If no object among O_i, O_{i+1}, and O_{i+2} fails, then, for all p and q, Majority(p, O_i, O_{i+1}, O_{i+2}, v_p) returns the same value as Majority(q, O_i, O_{i+1}, O_{i+2}, v_q).

Proof Clear from the algorithm. □

Theorem 5.6 Figure 4 gives a 1-tolerant self-implementation of N-consensus for R-arbitrary failures.

Proof Consider an execution E in which at most one of O_1, O_2, …, O_6 fails by R-arbitrary and the remaining are correct. Claim 5.11 implies that O satisfies validity in E. Clearly, either all of O_1, O_2, and O_3 are correct in E, or all of O_4, O_5, and O_6 are correct in E. In

† This implementation, and all other implementations for R-arbitrary failures in this paper, are gracefully degrading. Graceful degradation for R-arbitrary failures is, however, almost trivial to achieve; it only requires that, if all base objects are wait-free, then the derived object is also wait-free. For brevity, we omit references to graceful degradation in this section.

O_i : N-consensus objects (1 ≤ i ≤ 6)

Procedure Majority(p, O_1, O_2, O_3, v)
    count_p[0..1], w : integer local to p
begin
    count_p[0..1] := (0, 0)
    for i := 1 to 3 do
        w := f-propose(p, v, O_i)
        count_p[w] := count_p[w] + 1
    if count_p[0] > count_p[1] then
        return(0)
    else return(1)
end

Procedure Propose(p, v, O)
begin
    v := Majority(p, O_1, O_2, O_3, v)
    v := Majority(p, O_4, O_5, O_6, v)
    return(v)
end

Figure 4: 1-tolerant self-implementation of N-consensus for R-arbitrary failures


the latter case, Claim 5.12 implies that O satisfies agreement in E. In the former case, Claims 5.11 and 5.12 together imply that O satisfies agreement in E. It is obvious that O satisfies integrity, and is wait-free in E. Thus, by Proposition 5.1, O is correct in E. □
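
The behaviour asserted by Theorem 5.6 can be explored with a small simulation. The following Python sketch is an illustration under stated assumptions, not part of the original construction. It models only sequential (non-interleaved) executions; the names `Consensus`, `Faulty`, and `check_all` are hypothetical; and `f_propose` is assumed to replace any response outside {0, 1} by the caller's own proposal (f-propose is defined earlier in the paper and may differ in detail).

```python
from itertools import permutations, product


class Consensus:
    """Correct binary consensus object: the first proposal wins."""

    def __init__(self):
        self.decided = None

    def propose(self, value):
        if self.decided is None:
            self.decided = value
        return self.decided


class Faulty:
    """R-arbitrary object: always responds, but with scripted arbitrary values."""

    def __init__(self, script):
        self.script = list(script)
        self.calls = 0

    def propose(self, value):
        response = self.script[self.calls % len(self.script)]
        self.calls += 1
        return response


def f_propose(value, obj):
    # Assumption: out-of-range responses are filtered to the caller's proposal.
    response = obj.propose(value)
    return response if response in (0, 1) else value


def majority(objs, v):
    # Procedure Majority of Figure 4, applied to three base objects.
    count = [0, 0]
    for obj in objs:
        count[f_propose(v, obj)] += 1
    return 0 if count[0] > count[1] else 1


def propose(base, v):
    # Procedure Propose of Figure 4 over the six base objects.
    v = majority(base[0:3], v)
    v = majority(base[3:6], v)
    return v


def check_all(n_procs=3):
    """Try every single faulty position, response script, input vector, and order."""
    runs = 0
    for bad in range(6):
        for script in product((0, 1, 7), repeat=n_procs):
            for inputs in product((0, 1), repeat=n_procs):
                for order in permutations(range(n_procs)):
                    base = [Faulty(script) if i == bad else Consensus()
                            for i in range(6)]
                    outs = [propose(base, inputs[p]) for p in order]
                    assert len(set(outs)) == 1   # agreement
                    assert outs[0] in inputs     # validity
                    runs += 1
    return runs
```

Calling `check_all()` enumerates every sequential run of three processes with one faulty base object. It does not cover interleaved executions, which the proof handles through Claims 5.11 and 5.12.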

Given this 1-tolerant self-implementation, by the Booster lemma (Corollary 4.2) we obtain a t-tolerant self-implementation of N-consensus for R-arbitrary failures. However, the resulting resource complexity is O(t^{log_2 6}), which is even higher than the complexity of the implementation through translation mentioned above.

A more efficient recursive algorithm is presented in Figure 6. This algorithm implements a t-tolerant N-consensus object O from O_1, a ⌈t/2⌉-tolerant N-consensus object, O_2, a ⌊t/2⌋-tolerant N-consensus object, and the following (0-tolerant) N-consensus objects: A_0[1 … 3t + 1], A_1[1 … 3t + 1] and B[1 … 4t + 1]. Figure 5 illustrates the order in which the base objects of O are accessed by a process proposing 0 on O (the access pattern for a process proposing 1 on O is symmetric).
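
To get a feel for the resource counts discussed above, the following Python sketch compares the (3t + 1)(t + 1) cost of the translation-based construction with the cost of the recursive construction. It is a back-of-the-envelope illustration under assumptions that are not stated in this form in the text: O_1 and O_2 are taken to be ⌈t/2⌉- and ⌊t/2⌋-tolerant objects built by the same recursion, the base case t = 1 uses the 6 objects of Figure 4, and a 0-tolerant object is a single base object. The function names are hypothetical.

```python
from functools import lru_cache


def via_translation(t):
    """Base objects used by translation plus R-omission self-implementation."""
    return (3 * t + 1) * (t + 1)


@lru_cache(maxsize=None)
def via_recursion(t):
    """Base objects used by the recursive scheme (assumed recurrence)."""
    if t == 0:
        return 1          # a single base object is 0-tolerant
    if t == 1:
        return 6          # Figure 4
    halves = via_recursion((t + 1) // 2) + via_recursion(t // 2)
    arrays = 2 * (3 * t + 1) + (4 * t + 1)   # A_0, A_1 and B
    return halves + arrays


if __name__ == "__main__":
    for t in (1, 2, 4, 8, 16, 32, 64):
        print(t, via_translation(t), via_recursion(t))
```

Under these assumptions the recursive scheme uses more objects for small t and fewer for larger t, which is consistent with an O(t log t) bound against a quadratic one.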

Consider an execution E in which at most t base objects fail by R-arbitrary. Since O_1 is ⌈t/2⌉-tolerant and O_2 is ⌊t/2⌋-tolerant, either O_1 or O_2 is correct in E. The algorithm in Figure 6 is based on this key observation. We now sketch the intuition behind Figure 6.

A process p executing Propose(p, v_p, O) first executes f-propose(p, v_p, O_1); if O_1 seems correct to p, p adopts the value returned by f-propose(p, v_p, O_1) for Propose(p, v_p, O). If p detects that O_1 failed, p uses O_2 to determine the response for Propose(p, v_p, O).

Process p uses objects A_0[1 … 3t + 1], A_1[1 … 3t + 1] and B[1 … 4t + 1] to determine whether O_1 fails in E. O_1 can fail in one of the following ways: (i) by returning a value outside {0, 1}, (ii) by returning a value v ∈ {0, 1} that was not proposed by any process, and (iii) by returning 0 to some processes and 1 to other processes. The first case is overcome by using f-propose as a “filter”. The second and third cases are detected by using A_v[1 … 3t + 1] (v ∈ {0, 1}) and B[1 … 4t + 1], respectively.

Note that the failure detection provided by A_0[1 … 3t + 1], A_1[1 … 3t + 1] and B[1 … 4t + 1] is not perfect. O_1 may seem correct to some processes, and these processes base their decision on O_1. Other processes may detect that O_1 failed and base their decision on O_2.

The implementation in Figure 6 uses B to guarantee that both sets of processes decide on the same value. We describe the implementation in Figure 6 by sketching how it overcomes the different types of failures that O_1 may exhibit:

• O_1 returns a value that is not in {0, 1}. As before, procedure f-propose “filters” the response to eliminate this problem.

• O_1 returns a value that was not proposed by any process. A_0[1 … 3t + 1] and A_1[1 … 3t + 1] are used to detect that O_1 failed, as follows.

Process p executes f-propose(p, v_p, A_{v_p}[i]), for 1 ≤ i ≤ 3t + 1, before executing ans1_p := f-propose(p, v_p, O_1). It can be shown that if O_1 is correct in E, then all correct objects in A_{ans1_p}[1 … 3t + 1] are “set” to ans1_p. Since a maximum of t objects in A_{ans1_p}[1 … 3t + 1] may fail in E, p expects at least 2t + 1 objects to return ans1_p when p accesses A_{ans1_p}[1 … 3t + 1]. If p gets fewer than 2t + 1 copies of ans1_p, p knows that O_1 failed in E. Thus, p uses O_2 to reach the decision value.

• O_1 may return 0 to some processes and 1 to other processes. B[1 … 4t + 1] are used to detect that O_1 failed, as follows.

Immediately after executing ans1_p := f-propose(p, v_p, O_1), p executes f-propose(p, ans1_p, B[i]) for 1 ≤ i ≤ 4t + 1. If O_1 is correct in E, no process q will execute f-propose(q, 1 − ans1_p, B[i]) for 1 ≤ i ≤ 4t + 1. Thus, all correct objects in B[1 … 4t + 1] will be “set” to ans1_p.

Since a maximum of t objects in B[1 … 4t + 1] may fail in E, p expects at least 3t + 1 objects to return ans1_p when p accesses B[1 … 4t + 1]. If p gets fewer than 3t + 1 copies of ans1_p, p knows that O_1 failed in E. Thus, p uses O_2 to reach the decision value.

If p detects that O_1 failed in E, p uses O_2 to reach a decision. Recall that it is possible that some other process q did not detect O_1's failure, hence Propose(q, v_q, O) returned ans1_q. In this case, q gets at least 3t + 1 copies of ans1_q from B[1 … 4t + 1]. To ensure that p agrees with q in this case, p proposes to O_2 the value v'_p, which is the majority value that it got from B[1 … 4t + 1]. Note that care is taken to ensure that v'_p is valid: p should have received at least t + 1 copies of v'_p when p accessed A_{v'_p}[1 … 3t + 1]. We now prove:

A_0[1 … 3t + 1], A_1[1 … 3t + 1], B[1 … 4t + 1] : (0-tolerant) N-consensus objects
O_1 : ⌈t/2⌉-tolerant N-consensus object
O_2 : ⌊t/2⌋-tolerant N-consensus object

Procedure Propose(p, v_p, O)
    count_p[0..1], WitnessCount_p[0..1], belief_p, ans1_p, ans2_p, v'_p, i, w : integer local to p
begin
1            count_p[0..1], WitnessCount_p[0..1] := (0, 0)

2   Phase 1: for i := 1 to 3t + 1 do
3                w := f-propose(p, v_p, A_{v_p}[i])
4                if w = v_p then count_p[v_p] := count_p[v_p] + 1

5   Phase 2: ans1_p := f-propose(p, v_p, O_1)

6   Phase 3: for i := 1 to 4t + 1 do
7                w := f-propose(p, ans1_p, B[i])
8                WitnessCount_p[w] := WitnessCount_p[w] + 1

9   Phase 4: for i := 1 to 3t + 1 do
10               w := f-propose(p, v_p, A_{1 − v_p}[i])
11               if w = 1 − v_p then count_p[1 − v_p] := count_p[1 − v_p] + 1

12  Phase 5: Choose belief_p such that WitnessCount_p[belief_p] > WitnessCount_p[1 − belief_p]
13           if WitnessCount_p[belief_p] ≥ 3t + 1 and count_p[belief_p] ≥ 2t + 1 then
14               return(belief_p)
15           if WitnessCount_p[belief_p] ≥ 2t + 1 and count_p[belief_p] ≥ t + 1 then
16               v'_p := belief_p
17           else v'_p := v_p
18           ans2_p := propose(p, v'_p, O_2)
19           return(ans2_p)
end

Figure 6: Efficient t-tolerant self-implementation of N-consensus for R-arbitrary failures

Theorem 5.7 Figure 6 gives a t-tolerant self-implementation of N-consensus for R-arbitrary failures of resource complexity O(t log t).

Proof Consider an execution E in which at most t base objects fail by R-arbitrary, and the remaining are correct. We show below, through a series of claims, that O is correct in E, or equivalently (by Proposition 5.1), that O satisfies validity, agreement, and integrity, and is wait-free in E.

Proposition 5.1 is used very often in this proof. For brevity, we omit references to it.

Claim 5.13 If O_1 fails in E, then O_2 is correct in E.

Proof Suppose both O_1 and O_2 fail in E. Since O_1 is derived from a ⌈t/2⌉-tolerant implementation, at least ⌈t/2⌉ + 1 base objects of O_1 must fail in E. Similarly, at least ⌊t/2⌋ + 1 base objects of O_2 must fail in E. Thus a total of ⌈t/2⌉ + ⌊t/2⌋ + 2 > t base objects of O fail in E, a contradiction to the definition of E. □

Claim 5.14 If O_1 is correct in E, O satisfies validity and agreement in E.

Proof Suppose O_1 is correct. Thus, O_1 satisfies validity and agreement. By the agreement property of O_1, ans1_p = ans1_q for all p, q. (Let v = ans1_p.) Thus every process proposes the same value v to every B[i] in Phase 3. Since at most t objects in B[1 … 4t + 1] fail, belief_p = v and WitnessCount_p[belief_p] ≥ 3t + 1 (for every p).

By the validity property of O_1, some process q will have invoked propose(q, v, O_1) before any process gets the response v from O_1. This implies that q will have finished Phase 1 before any process begins Phase 3. Since at least 2t + 1 objects in A_v[1 … 3t + 1] are correct, it follows that for all p, count_p[v] ≥ 2t + 1 by the end of Phase 4 of p. Thus we have WitnessCount_p[belief_p] ≥ 3t + 1 and count_p[belief_p] ≥ 2t + 1 (for every p). Hence every p decides v (the proposal of q) by line 14. □
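
The five phases of Figure 6 can be exercised in a small simulation. The Python sketch below is illustrative only and rests on several assumptions: executions are sequential (no interleaving), O_1 is modelled as a single black-box object that may answer arbitrarily, all objects in A_0, A_1, B and O_2 are correct, the complement of a binary value x is written `1 - x`, and `f_propose` is assumed to replace an out-of-range response by the caller's proposal. The names `Consensus`, `Scripted`, and `check` are hypothetical.

```python
from itertools import product


class Consensus:
    """Correct binary consensus object: the first proposal wins."""

    def __init__(self):
        self.decided = None

    def propose(self, value):
        if self.decided is None:
            self.decided = value
        return self.decided


class Scripted:
    """Stand-in for a failed O_1: replies with a fixed script of arbitrary values."""

    def __init__(self, script):
        self.script = list(script)
        self.calls = 0

    def propose(self, value):
        response = self.script[self.calls % len(self.script)]
        self.calls += 1
        return response


def f_propose(value, obj):
    # Assumption: responses outside {0, 1} are replaced by the caller's proposal.
    response = obj.propose(value)
    return response if response in (0, 1) else value


def propose(v, t, A, B, O1, O2):
    count = [0, 0]
    witness = [0, 0]
    for i in range(3 * t + 1):                      # Phase 1
        if f_propose(v, A[v][i]) == v:
            count[v] += 1
    ans1 = f_propose(v, O1)                         # Phase 2
    for i in range(4 * t + 1):                      # Phase 3
        witness[f_propose(ans1, B[i])] += 1
    for i in range(3 * t + 1):                      # Phase 4
        if f_propose(v, A[1 - v][i]) == 1 - v:
            count[1 - v] += 1
    belief = 0 if witness[0] > witness[1] else 1    # Phase 5
    if witness[belief] >= 3 * t + 1 and count[belief] >= 2 * t + 1:
        return belief
    if witness[belief] >= 2 * t + 1 and count[belief] >= t + 1:
        v2 = belief
    else:
        v2 = v
    return O2.propose(v2)


def check(t=1, n_procs=3):
    """Run every O_1 script and input vector; assert agreement and validity."""
    runs = 0
    for script in product((0, 1, 7), repeat=n_procs):
        for inputs in product((0, 1), repeat=n_procs):
            A = [[Consensus() for _ in range(3 * t + 1)] for _ in range(2)]
            B = [Consensus() for _ in range(4 * t + 1)]
            O1, O2 = Scripted(script), Consensus()
            outs = [propose(v, t, A, B, O1, O2) for v in inputs]
            assert len(set(outs)) == 1   # agreement
            assert outs[0] in inputs     # validity
            runs += 1
    return runs
```

With a correct O_1 (for example `Consensus()` in place of `Scripted`), every process in such a run returns at line 14, as in the proof of Claim 5.14; with a misbehaving O_1, processes either all return at line 14 or fall back to O_2.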

Claim 5.15 If O_1 fails in E, O satisfies validity and agreement in E.

Proof Suppose O_1 fails. Then by Claim 5.13, O_2 is correct, and thus satisfies validity and agreement. We need to consider two cases.

CASE 1 Suppose some process p returns by line 14. This implies that WitnessCount_p[belief_p] ≥ 3t + 1 and count_p[belief_p] ≥ 2t + 1. Since at most t base objects fail, it follows that, for every q, WitnessCount_q[belief_p] ≥ 2t + 1 and count_q[belief_p] ≥ t + 1. By line 12, this implies that belief_q = belief_p. Let val = belief_p. Since WitnessCount_q[belief_q] ≥ 2t + 1 and count_q[belief_q] ≥ t + 1, either q returns belief_q = val by line 14 and we have agreement between p and q, or q sets v'_q to belief_q by line 16, making v'_q equal to val. Thus every q that does not return by line 14 proposes v'_q = val on O_2. By the validity property of O_2, ans2_q = val, and q returns val by line 19. Again we have agreement between p and q.

Corollary 5.2 The following object types have t-tolerant self-implementations for R-arbitrary failures: (2-process) fetch&add, queue, stack, test&set, and (N-process) compare&swap, move, swap.


6  Tolerating  non-responsive  failures 

So far we have considered base objects that remain responsive (i.e., wait-free) even if they fail. Thus, a process can access a base object and afford to wait for a response before proceeding to access the next one. In other words, base objects can be accessed sequentially. With non-responsive failures, waiting on a base object that fails could block the process forever. Hence, to tolerate non-responsive failures, we allow a process to access base objects “in parallel”‡, so that it can complete its operation on the derived object even if some of the base objects fail and never respond.

As we will see, this ability to access base objects in parallel allows us to build t-tolerant implementations of register, even for arbitrary failures. In contrast, we show that N-consensus does not have a (deterministic) implementation that tolerates the crash of a single base object even if we do not restrict the number and the type of the base objects that can be used in the implementation. However, randomization circumvents this impossibility result. Every object type has a t-tolerant randomized implementation from register, even for arbitrary failures.

The impossibility results of this section are proved by reducing the consensus problem [FLP85] to the problem in question. The consensus problem for a system of N processes is defined as follows. Each process p_i has an initial binary input v_i. The consensus problem requires each correct process to reach the same (irrevocable) decision value d such that d ∈ {v_1, v_2, …, v_N}.

Theorem 6.1 There is no 1-tolerant implementation of 2-consensus for crash failures.

Proof Suppose, for contradiction, there is a finite list L = {T_1, T_2, …, T_l} of object types such that there is a 1-tolerant implementation I of 2-consensus from L for crash failures. We will use this implementation to solve the consensus problem among a set of l + 2 processes, one of which may crash, in a system in which processes communicate only through registers.

Consider the concurrent system S consisting of l + 2 processes named {p_1, p_2} ∪ {q_j | 1 ≤ j ≤ l}, and 4l + 1 registers named {invocation(i, j), response(j, i) | 1 ≤ i ≤ 2, 1 ≤ j ≤ l} ∪ {decision}. We claim that the consensus problem is solvable in S even if one process crashes. The following is the protocol. Let v_i ∈ {0, 1} be the initial input of p_i. The basic idea consists of two steps:

‡ However, we do not allow a process to invoke an operation on a base object if its previous invocation on that object is still pending.

1. Use a set {o_1, o_2, …, o_l} of base objects of type T_1, T_2, …, T_l, and the implementation I, to construct a 2-consensus object O = I(o_1, …, o_l) that tolerates the crash of one of its base objects.

2. In system S, process q_j (1 ≤ j ≤ l) simulates the base object o_j, and process p_i (i = 1, 2) simulates the execution of Propose(p_i, v_i, O) on the derived object O.

The details are given below.

Initialize all 4l + 1 registers to ⊥. Process p_i simulates Propose(p_i, v_i, O) as follows. If Propose(p_i, v_i, O) requires p_i to invoke some operation op on o_j, p_i appends op to the contents of invocation(i, j). If Propose(p_i, v_i, O) requires p_i to check if a response to some outstanding invocation on o_j has arrived, p_i checks if a response has been appended by q_j (which simulates o_j) to response(j, i). If Propose(p_i, v_i, O) returns a value v, p_i first writes v in the decision register, and then decides v. In addition to (and concurrently with) the above, p_i periodically checks if the register decision contains a non-⊥ value. If so, it decides that value.

Process q_j simulates the base object o_j as follows. Periodically q_j checks the registers invocation(1, j) and invocation(2, j), in a round-robin fashion. If q_j notices that some operation op has been appended to invocation(i, j), q_j simulates the application of op to o_j and appends the corresponding response to response(j, i). In addition to (and concurrently with) the above, q_j periodically checks if the register decision contains a non-⊥ value. If so, it decides that value.

The above simulation protocol solves the consensus problem among the l + 2 processes in the concurrent system S, even if one of them crashes. To see this, consider any execution E of the concurrent system S in which at most one process crashes. Let E' be the corresponding “simulated” execution of the derived object O. Note that the crash of one process in S corresponds to the crash of at most one (simulated) base object of the (simulated) derived object O in E'. Since I, the 2-consensus implementation from which O is derived, is 1-tolerant for crash, O is correct in E' (despite the crash of one of its base objects). Thus, by Proposition 5.1, O satisfies integrity, validity, and agreement, and is wait-free in E'. Since O is wait-free (in E'), if p_i does not crash, Propose(p_i, v_i, O) eventually returns some value v (in E'). Since O satisfies integrity, v is a binary value. Since O satisfies validity, v is either v_1 or v_2. Since O satisfies agreement, Propose(p_1, v_1, O) and Propose(p_2, v_2, O) never return different values. Thus, from the protocol, p_1 and p_2 do not write different values in register decision. Since at most one process crashes, at least one of p_1 and p_2 will eventually write a binary value v in register decision. Since all correct processes periodically check the decision register, they eventually decide v.

We showed that we can use I to solve the consensus problem in system S, and this contradicts the impossibility result of Loui and Abu-Amara [LAA87]. □

We can strengthen the above result as follows. Suppose that at most one base object may fail, and it can only do so by being “unfair” (i.e., by not responding) to at most one process. Furthermore, suppose that the identity of this process is a priori “common knowledge” among all the processes. Even with this extremely weak model of object failure, called 1-unfairness to a known process, we can prove the following:

Theorem 6.2 There is no 1-tolerant implementation of 2-consensus for 1-unfairness to a known process.

Proof (sketch) Suppose, for contradiction, there is a finite list L = {T_1, T_2, …, T_l} of object types such that there is a 1-tolerant implementation I of 2-consensus from L for 1-unfairness to, say, process p_1. Consider the concurrent system S, as defined in the proof of Theorem 6.1. Suppose processes in S run the same simulation protocol as in that proof. There are two cases:

1. No process q_k crashes. In this case, it is easy to see that processes in S solve the consensus problem (exactly as before).

2. Some process q_k crashes. In this case, processes in S may fail to solve the consensus problem for the following reason. The crash of q_k corresponds to the crash of the simulated base object o_k. This object is now potentially unfair to both p_1 and p_2. But I tolerates unfairness to only p_1. So the derived 2-consensus object O of I is not necessarily correct.

To circumvent the problem that arises in Case 2, we modify the simulation protocol as follows: If Propose(p_2, v_2, O) requires p_2 to invoke some operation op on some o_j, p_2 appends op to the contents of invocation(2, j), as before, but now it also waits until a corresponding response is appended to response(j, 2) by process q_j. The rest of the simulation protocol remains exactly as before. We now reconsider the above two cases with the modified simulation protocol:

1. No process q_k crashes. As before, it is easy to see that processes in S solve the consensus problem.

2. Some process q_k crashes. If p_2 attempts to access o_k after the crash of q_k, it will simply wait for the response forever§. Therefore, at worst, to process p_1, the crash of q_k looks like o_k is unfair to p_1, and p_2 is extremely slow. Since I tolerates the unfairness of one base object to p_1, O remains correct. Since p_1 does not crash (we assumed that only one process in S crashes, and this is q_k), Propose(p_1, v_1, O) returns a value that p_1 writes into decision. The rest of the proof is as in Theorem 6.1.

Again, we have a contradiction to the impossibility result in [LAA87]. □

From  the  above  two  theorems  we  have: 

§ Of course, it also continues to read the decision register periodically, and decides if a non-⊥ value is found there.

Corollary 6.1 If type T implements 2-consensus, then there is no 1-tolerant implementation of T for crash or for 1-unfairness to a known process.

From [Her91] and this corollary, we conclude that compare&swap, fetch&add, move, queue, stack, sticky-bit, swap, test&set, and several other common types do not have a 1-tolerant implementation for crash or 1-unfairness to a known process. In contrast to the above impossibility results we show

Theorem 6.3 boolean register and unbounded register have t-tolerant self-implementations for arbitrary failures.

This follows immediately from the following lemma and the fact that one can implement a multi-reader, multi-writer n-valued (resp. unbounded) atomic register using 1-reader, 1-writer, boolean (resp. unbounded) safe registers.

Lemma 6.1 A t-tolerant 1-reader, 1-writer, n-valued (resp. unbounded) safe register can be implemented from 5t + 1 1-reader, 1-writer, n-valued (resp. unbounded) safe registers, at most t of which may experience arbitrary failures.

Proof (sketch) Informally, the reader invokes a ‘read’ on each base register (the reader delays this read if its previous read on the base register is still pending). When it gets a response from 4t + 1 distinct registers, it returns the majority value. If there is no majority, it returns an arbitrary value. To write a value v, the writer invokes a ‘write v’ on each base register (again, this write is delayed if the previous write on the base register is still pending). The writing completes when 4t + 1 base registers return an “ack”. It is easy to verify that the above scheme implements a safe register that is correct even if at most t base registers experience arbitrary failures: a read that follows a completed write hears from at least 3t + 1 registers that acknowledged the write, at least 2t + 1 of which are correct, and 2t + 1 is a majority of the 4t + 1 responses. □
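
The counting behind Lemma 6.1 can be checked by brute force for small t. The Python sketch below is a simplified model offered as an illustration, not the construction itself: it considers only a read that starts after a write has completed, treats a correct register that has not yet acknowledged the write as still holding the old value, and lets faulty registers answer with either the old value or junk. The names `read_majority` and `check` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def read_majority(responses):
    """Return the value held by a strict majority of responses, else None."""
    value, votes = Counter(responses).most_common(1)[0]
    return value if votes > len(responses) // 2 else None


def check(t):
    """Enumerate faulty sets, write-ack sets and read-response sets."""
    n = 5 * t + 1
    quorum = 4 * t + 1
    runs = 0
    for faulty in combinations(range(n), t):
        for acked in combinations(range(n), quorum):
            for heard in combinations(range(n), quorum):
                for lie in ("old", "junk"):
                    responses = []
                    for r in heard:
                        if r in faulty:
                            responses.append(lie)      # arbitrary answer
                        elif r in acked:
                            responses.append("new")    # write was applied
                        else:
                            responses.append("old")    # write still pending
                    assert read_majority(responses) == "new"
                    runs += 1
    return runs
```

In the worst case the reader sees t faulty answers and t stale answers, 2t in total, against at least 2t + 1 copies of the written value, which is why 4t + 1 responses out of 5t + 1 registers suffice in this model.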

Randomized implementations of N-consensus from register are well known (for example, see [Asp90]). Together with Theorem 6.3, this implies that randomized t-tolerant implementations of N-consensus from register exist for arbitrary failures. Combining this with Theorem 6.3 and the universality results of [Her91, Plo89], we have

Theorem  6.4  Every  finite  object  type  has  a  randomized  t-tolerant  implementation  from 
boolean  register  for  arbitrary  failures,  and  every  infinite  object  type  has  a  randomized 
t-tolerant  implementation  from  unbounded  register  for  arbitrary  failures. 

Thus, if a finite (resp. infinite) object type T implements boolean register (resp. unbounded register), then T has a randomized t-tolerant self-implementation for arbitrary failures. This implies that compare&swap, fetch&add, queue, move, stack, swap, test&set have t-tolerant randomized self-implementations, even for arbitrary failures!

Our next result concerns the nature of arbitrary failures. It states that the problem of tolerating arbitrary failures can be reduced to two strictly simpler problems: tolerating R-arbitrary failures and tolerating omission failures.

Lemma 6.2 (Decomposability of arbitrary failures) A type T has a t-tolerant self-implementation for arbitrary failures if and only if T has a t-tolerant self-implementation I_A for R-arbitrary failures, and I_O for omission failures.

Proof (sketch) The “only if” direction is obvious. To prove the “if” direction, define I(o_1, o_2, …, o_{nm}) = I_O(I_A(o_1, …, o_n), …, I_A(o_{(m−1)n+1}, …, o_{nm})). It can be verified that I is a t-tolerant self-implementation of T for arbitrary failures. □


7  Graceful  degradation  for  benign  failure  models 

We  have  seen  that  every  object  type  has  a  t-tolerant  implementation  for  R-crash  and  R- 
omission  failures.  But  what  if  we  also  require  the  implementation  to  be  gracefully  degrading? 
The  results  are  mostly  negative  for  R-crash,  but  not  so  for  R-omission. 

7.1  R-crash 

Consider a system that supports a given set S of “hardware” objects. Assume that these objects may fail, but if they do, they are guaranteed to only fail by R-crash. Suppose we wish to implement an object O of type T using only objects in S, and that we require O to function correctly only in the absence of failures. However, when objects in S fail by R-crash, we would like O to fail only by R-crash. This last requirement is desirable for two reasons:

• The benign failure semantics of R-crash are desirable.

• Such an object O appears like any other hardware object of the system. In other words, with this “software implementation” of O, the system would be no different, in functionality and failure semantics, from one that directly supports all the objects in S ∪ {O} in hardware.

In our terminology, we are seeking a gracefully degrading implementation of T for R-crash from the types (of the objects) in S. Unfortunately, as we show below, many object types do not have such implementations, even from very powerful object types. This negative result implies that, in many cases, the simple and desirable R-crash failure semantics cannot be achieved.

An object type T is order-sensitive if it is deterministic and the following holds: There
exist a state S in G(T), operations op, op' (not necessarily distinct) in OP(T), and values
u, v, u', v' such that each of (op,u),(op',u') and (op',v'),(op,v) is consistent from the state
S of T, and u ≠ v and u' ≠ v'. Intuitively, when an object O is in the state S, and
two processes p and q invoke operations op and op' concurrently on O, they can, based
on the return values, determine the order in which their operations linearized. queue
is an example of an order-sensitive object type. To see this, let S be the state in which
there are two elements 5 and 10 in the queue (5 at the head), and let both op and op'
be deq. Now we have u = 5, u' = 10, v' = 5, and v = 10. Thus u ≠ v and u' ≠ v',
as required. compare&swap, N-consensus, stack, test&set are some other examples of
order-sensitive object types. An object type is non-order-sensitive if it is deterministic and
not order-sensitive. Examples of non-order-sensitive types include register, sticky-bit,
move, and swap.
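
The definition can be checked mechanically for one state and a finite set of operations. The following is a minimal sketch, not part of the original report: the function names and the encoding of a sequential specification as a function `apply(state, op) -> (new_state, response)` are assumptions made for illustration. Finding a witness shows a type is order-sensitive; finding none for a single state does not by itself show the type is non-order-sensitive, since the definition quantifies over all states.

```python
from itertools import product

def queue_apply(state, op):
    """Hypothetical sequential spec of a queue; state is a tuple, head first."""
    if op[0] == "deq":
        return (state[1:], state[0]) if state else (state, None)
    return (state + (op[1],), "ack")

def register_apply(state, op):
    """Hypothetical sequential spec of a read/write register."""
    if op[0] == "read":
        return (state, state)
    return (op[1], "ack")

def order_sensitive_witness(apply, state, ops):
    """Search for op, op' whose responses from `state` reveal their order.

    Returns (op, op', u, u', v, v') with u != v and u' != v', or None.
    """
    for op, op2 in product(ops, repeat=2):
        s1, u = apply(state, op)      # op first ...
        _, u2 = apply(s1, op2)        # ... then op'
        s2, v2 = apply(state, op2)    # op' first ...
        _, v = apply(s2, op)          # ... then op
        if u != v and u2 != v2:
            return (op, op2, u, u2, v, v2)
    return None

# The queue state from the text: elements 5 and 10, with 5 at the head.
print(order_sensitive_witness(queue_apply, (5, 10), [("deq",)]))
# A register in state 0 with reads and writes: no witness is found.
print(order_sensitive_witness(register_apply, 0,
                              [("read",), ("write", 1), ("write", 2)]))
```

For the queue, the witness reproduces the values u = 5, u' = 10, v = 10, v' = 5 given above. For the register, a write always returns the same acknowledgement, so at most one of the two operations can observe the order.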

Theorem 7.1 There is no gracefully degrading implementation of any order-sensitive
object type for R-crash from any list of non-order-sensitive object types.

Proof Suppose there are T, ℒ, and ℐ such that T is an order-sensitive type, ℒ =
(T_1, T_2, ..., T_n) is a list of non-order-sensitive types, and ℐ is a gracefully degrading
implementation of T from ℒ for R-crash. We arrive at a contradiction after a series of claims
involving bivalency arguments [FLP85] and indistinguishable scenarios.

Let O = ℐ(O_1, O_2, ..., O_n), and op, op', S, u, v, u', v' be as given in the definition of
an order-sensitive type. Consider the concurrent system consisting of two processes p and
q, and the shared object O (implemented from O_1, O_2, ..., O_n). Define the configuration
(at an instant t) as the tuple (S_p, S_q, S_O) where S_p, S_q, and S_O are the states of process p,
process q, and object O respectively (at the instant t). Let C_0 denote the configuration in
which O is in state S, and p, q are about to execute Apply(p, op, O) and Apply(q, op', O)
respectively.

Claim 7.1 Suppose all base objects are correct. For any interleaving of the steps in the
complete executions of Apply(p, op, O) and Apply(q, op', O), either Apply(p, op, O) returns
u and Apply(q, op', O) returns u', or Apply(p, op, O) returns v and Apply(q, op', O) returns
v'.


Proof In the linearization of the execution history of object O, either Apply(p, op, O)
immediately precedes Apply(q, op', O), or Apply(q, op', O) immediately precedes Apply(p, op, O).
This, together with the definitions of u, u', v, v', and the fact that T is a deterministic type,
trivially implies the claim. □

Let C denote a configuration reached from C_0 after some interleaving of (partial)
executions of Apply(p, op, O) and Apply(q, op', O). We say C is X-valent if, in the absence of
base object failures, Apply(p, op, O) returns X, no matter how the steps of Apply(p, op, O)
and Apply(q, op', O) interleave when execution resumes from C. By Claim 7.1, if C is
X-valent, either X = u or X = v. C is monovalent if C is either u-valent or v-valent. C is
bivalent if it is neither u-valent nor v-valent.

Claim 7.2 C_0 is bivalent.

Proof Starting from C_0, if p completes all the steps of Apply(p, op, O) before q starts
Apply(q, op', O), then Apply(p, op, O) returns u. Thus C_0 is not v-valent.

Similarly, starting from C_0, if q completes all the steps of Apply(q, op', O) before p starts
Apply(p, op, O), then Apply(q, op', O) returns v'. Thus, by Claim 7.1, when Apply(p, op, O)
completes, it returns v. Thus C_0 is not u-valent.

Since C_0 is neither u-valent nor v-valent, it is bivalent. □

We say C' is a reachable configuration from C if, starting from the configuration C,
there is some interleaving of the steps of p and q such that C' is the configuration at the
end of that interleaving. Given a configuration C, let C(p) denote the configuration that
results when p takes a single step of Apply(p, op, O) from C. C(q) is similarly defined.

Claim 7.3 There is a bivalent configuration C_crit reachable from C_0 such that C_crit(p) and
C_crit(q) are both monovalent.

Proof Interleave the steps of Apply(p, op, O) and Apply(q, op', O) as shown in Figure 7.
Since O is wait-free, the repeat...until loop in the figure must terminate after a finite number
of iterations. Let C_crit be the value of C just when the loop terminates. It is easy to verify
that C_crit satisfies the properties required by the claim. □


    C := C_0
    repeat
        if C(p) is bivalent then
            C := C(p)
        if C(q) is bivalent then
            C := C(q)
    until (C(p) is monovalent) ∧ (C(q) is monovalent)

Figure 7: Reaching a critical bivalent configuration
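
The loop of Figure 7 can be exercised on a small example. The sketch below is illustrative only and rests on assumptions that are not in the report: a toy system in which each of p and q takes one local step and then one atomic deq on a queue holding 5 and 10, a configuration encoded as `(pc_p, pc_q, queue, ret_p)`, and valency computed by exhaustive search over failure-free interleavings.

```python
from functools import lru_cache

# Hypothetical toy system: pc == 0 is a local step, pc == 1 is the atomic deq,
# pc == 2 means the process has finished. ret_p records p's return value.
C0 = (0, 0, (5, 10), None)

def step(config, who):
    """Configuration after `who` ('p' or 'q') takes its single enabled step."""
    pc_p, pc_q, queue, ret_p = config
    pc = pc_p if who == "p" else pc_q
    assert pc < 2, "process has no enabled step"
    if pc == 1:                       # the atomic deq on the shared queue
        head, queue = queue[0], queue[1:]
        if who == "p":
            ret_p = head
    if who == "p":
        return (pc_p + 1, pc_q, queue, ret_p)
    return (pc_p, pc_q + 1, queue, ret_p)

@lru_cache(maxsize=None)
def valence(config):
    """Values p can return over all failure-free interleavings from config."""
    pc_p, pc_q, _, ret_p = config
    if pc_p == 2:                     # p has already returned
        return frozenset({ret_p})
    result = set(valence(step(config, "p")))
    if pc_q < 2:
        result |= valence(step(config, "q"))
    return frozenset(result)

def bivalent(config):
    return len(valence(config)) > 1

def critical_configuration(c):
    """The repeat...until loop of Figure 7, starting from a bivalent c."""
    while True:
        if bivalent(step(c, "p")):
            c = step(c, "p")
        if bivalent(step(c, "q")):
            c = step(c, "q")
        if not bivalent(step(c, "p")) and not bivalent(step(c, "q")):
            return c

crit = critical_configuration(C0)
print(crit, valence(step(crit, "p")), valence(step(crit, "q")))
```

In this toy system the loop stops once both processes are about to execute deq: a step by p from there fixes p's return value at 5, and a step by q fixes it at 10. The loop relies on the observation that in a bivalent configuration both processes still have an enabled step.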


Since C_crit is bivalent, C_crit(p) and C_crit(q) cannot both be X-valent for the same X.
Thus, either C_crit(p) is u-valent and C_crit(q) is v-valent, or C_crit(p) is v-valent and C_crit(q)
is u-valent. Without loss of generality, we will assume the former.

Claim 7.4 The enabled steps of p and q in C_crit access the same base object.

Proof Suppose not. Then (C_crit(p))(q) and (C_crit(q))(p) are identical configurations, and
yet, the former is u-valent and the latter v-valent. This is impossible since u ≠ v. □

Assume that O_k is the base object mentioned in the above claim, and Apply(p, oper, O_k),
Apply(q, oper', O_k) are the enabled steps of p and q respectively in C_crit. Since O_k is an
object of a non-order-sensitive type, either Apply(q, oper', O_k) returns the same value whether
applied in C_crit or C_crit(p), or Apply(p, oper, O_k) returns the same value whether applied in
C_crit or C_crit(q). In the following, we will deal with the former case. The latter case can be
handled similarly, and is omitted.


Claim 7.5 Consider

Scenario S1 (starts from the configuration C_crit)

1. Process q takes the step Apply(q, oper', O_k).

2. Process p completes the execution of Apply(p, op, O).

3. All base objects O_1, O_2, ..., O_n fail by R-crash.

4. Process q resumes and completes the execution of Apply(q, op', O).

Then Apply(p, op, O) returns v and Apply(q, op', O) returns v'.

Proof Since q takes the step from C_crit, and C_crit(q) is v-valent, and no base object failures
occur before p completes the execution of Apply(p, op, O) in Item 2, Apply(p, op, O) returns
v in Item 2 of the scenario.

Suppose Apply(q, op', O) returns ⊥. Since ℐ is gracefully degrading, O must either
be correct or fail by R-crash. Given that Apply(p, op, O) returns a non-⊥ response, this
requires that Apply(p, op, O) precedes Apply(q, op', O) in the linearization order. Doing so,
however, implies that (op, v) is a sequential execution from S consistent with T. This is
false since (op, u) is the only such sequence consistent from the state S of T, and v ≠ u. Thus
Apply(q, op', O) cannot return ⊥.

Suppose Apply(q, op', O) returns w where ⊥ ≠ w ≠ v'. Since in the linearization,
either Apply(p, op, O) precedes Apply(q, op', O), or Apply(q, op', O) precedes Apply(p, op, O),
it follows that either (op, v),(op', w) or (op', w),(op, v) is a sequential execution from S
consistent with T. This is false since (op, u),(op', u') and (op', v'),(op, v) are the only sequences
consistent from the state S of T, and u ≠ v and w ≠ v'.

We conclude that Apply(q, op', O) must return v'. □

Claim 7.6 Consider

Scenario S2 (starts from the configuration C_crit)

1. Process p takes the step Apply(p, oper, O_k).

2. Process q takes the step Apply(q, oper', O_k).

3. Process p resumes and completes the execution of Apply(p, op, O).

4. All base objects O_1, O_2, ..., O_n fail by R-crash.

5. Process q resumes and completes the execution of Apply(q, op', O).

Then Apply(p, op, O) returns u and Apply(q, op', O) returns v'.

Proof Since p takes the step from C_crit, and C_crit(p) is u-valent, and no base object failures
occur before p completes the execution of Apply(p, op, O) in Item 3, Apply(p, op, O) returns
u in Item 3 of the scenario. Since q cannot distinguish S2 from S1 (its step on O_k returns the
same value in both, and all base objects have crashed when it resumes), Apply(q, op', O)
returns v' as in S1. □

Neither (op, u),(op', v') nor (op', v'),(op, u) is a sequence consistent from the state S of
T. Hence the execution in Claim 7.6 is not linearizable. Thus the failure of O in S2 is not
by R-crash. We conclude that ℐ is not a gracefully degrading implementation for R-crash,
a contradiction which concludes the proof of Theorem 7.1. □
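
The final step of the proof can be illustrated with the queue example used earlier (state holding 5 and 10, both operations deq, so u = 5 and v' = 5). The following sketch is an assumption-laden illustration, not the report's construction: it simply tries both sequential orders of the two operations and reports which orders reproduce the observed responses.

```python
from itertools import permutations

def deq(state):
    """Sequential deq on a non-empty queue held as a tuple, head first."""
    return state[1:], state[0]

def consistent_orders(initial, responses):
    """Orders of the two deq operations that reproduce `responses`.

    `responses` maps a process name to the value its deq returned.
    An empty result means no sequential order explains the execution.
    """
    explained = []
    for order in permutations(responses):
        state, ok = initial, True
        for proc in order:
            state, res = deq(state)
            ok = ok and res == responses[proc]
        if ok:
            explained.append(order)
    return explained

# Scenario S2 on the queue example: p returns u = 5 and q returns v' = 5.
print(consistent_orders((5, 10), {"p": 5, "q": 5}))   # no order explains this
# The two failure-free outcomes allowed by Claim 7.1.
print(consistent_orders((5, 10), {"p": 5, "q": 10}))
print(consistent_orders((5, 10), {"p": 10, "q": 5}))
```

Both processes receiving the head element 5 cannot be explained by either order, which is why the failure of O in S2 is more severe than R-crash.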

Preserving the failure semantics of the underlying system is a desirable property of
an implementation. For R-crash, the above theorem shows that this property is often not
achievable; implementations necessarily amplify the R-crash failures of base objects. For
example, consider a system that supports registers and sticky-bits in “hardware”. In such
a system, any object can be implemented [Plo89], including (for example) queues. Suppose
we are given the following guarantee: if any of the given registers or sticky-bits fail, they fail
only by R-crash. Can we implement a queue that cannot fail more severely than R-crash?
The above theorem shows that this cannot be done.

Requiring  a  derived  object  to  inherit  the  R-crash  semantics  of  its  base  objects  is  even 
more  difficult  if  we  add  the  requirement  that  the  derived  object  be  1-tolerant:  Even  if  we  do 
not  restrict  the  types  of  primitives  available  in  the  underlying  system,  such  implementations 
do  not  exist  for  most  objects  of  interest.  This  is  shown  by  the  theorem  below. 

Theorem  7.2  There  is  no  1-tolerant  gracefully  degrading  implementation  of  any  order- 
sensitive  object  type  for  R-crash. 

Proof Suppose there are T, ℒ, and ℐ such that T is an order-sensitive type, ℒ =
(T_1, T_2, ..., T_n) is a list of types, and ℐ is a 1-tolerant gracefully degrading implementation
of T from ℒ for R-crash. We arrive at a contradiction after a series of claims involving
indistinguishable scenarios. Let O = ℐ(O_1, O_2, ..., O_n), and op, op', S, u, v, u', v' be as
given in the definition of order-sensitive types. Suppose O is in state S, and p, q are about
to execute Apply(p, op, O) and Apply(q, op', O) respectively.

Claim 7.7 Suppose all base objects are correct. For any interleaving of the steps in the
complete executions of Apply(p, op, O) and Apply(q, op', O), either Apply(p, op, O) returns
u and Apply(q, op', O) returns u', or Apply(p, op, O) returns v and Apply(q, op', O) returns
v'.

Proof Same as Claim 7.1. □

Claim 7.8 There exist a (possibly empty) sequence σ of steps of p and a step s of p such
that the following Scenarios S1 and S2 are possible.

Scenario S1 (scenario starts with O in state S)

1. Process p initiates and partially executes Apply(p, op, O) by completing the steps in σ.

2. Process q initiates and completes (all the steps of) Apply(q, op', O), returning v'.

3. p completes the remaining steps of Apply(p, op, O), returning v.

Scenario S2 (scenario starts with O in state S)

1. p initiates and (partially) executes Apply(p, op, O) by completing the steps in σ·s.

2. q initiates and completes (all the steps of) Apply(q, op', O), returning u'.

3. p completes the remaining steps of Apply(p, op, O), returning u.

Proof Clearly, if process p executes no steps of Apply(p, op, O) before process q initiates and
completes Apply(q, op', O), then Apply(q, op', O) must return v'. Further, if p initiates and
completes all the steps of Apply(p, op, O) (let θ be this sequence of steps) before q initiates
and completes Apply(q, op', O), then Apply(q, op', O) must return u'. Together with Claim
7.7, by which Apply(q, op', O) must return either u' or v', the above implies that there exist
a sequence σ of steps and a step s such that σ·s is a prefix of θ for which the claim holds.
□

Hereafter we will assume O_k is the base object accessed by p in step s.

Claim 7.9 Consider

Scenario S3 (scenario starts with O in state S)

1. p initiates and (partially) executes Apply(p, op, O) by completing the steps in σ·s.

2. q initiates and completes (all the steps of) Apply(q, op', O), returning u' (as in S2).

3. O_1, O_2, ..., O_n fail by R-crash.

4. p completes the remaining steps of Apply(p, op, O).

Then Apply(p, op, O) returns u.

Proof Suppose Apply(p, op, O) returns ⊥. Since ℐ is gracefully degrading, O must either
be correct or fail by R-crash. This requires, given that Apply(q, op', O) returns a non-⊥
response, that Apply(q, op', O) precede Apply(p, op, O) in the linearization order. Doing so,
however, implies that (op', u') is a sequential execution from S consistent with T. This is
false since u' ≠ v', T is deterministic, and (op', v') is a sequential execution from S consistent
with T. Thus Apply(p, op, O) cannot return ⊥.

Suppose Apply(p, op, O) returns w where ⊥ ≠ w ≠ u. Since in the linearization,
either Apply(p, op, O) precedes Apply(q, op', O) or Apply(q, op', O) precedes Apply(p, op, O),
it follows that either (op, w),(op', u') or (op', u'),(op, w) is a sequential execution from S
consistent with T. This is false since (op, u),(op', u') and (op', v'),(op, v) are the only sequences
consistent from the state S of T, and w ≠ u, u' ≠ v'.

We conclude that Apply(p, op, O) must return u. □

Claim 7.10 Consider

Scenario S4 (scenario starts with O in state S)

1. p initiates and (partially) executes Apply(p, op, O) by completing the steps in σ·s.

2. O_k fails by R-crash.

3. q initiates and completes (all the steps of) Apply(q, op', O).

4. O_1, ..., O_{k-1} and O_{k+1}, ..., O_n also fail by R-crash.

5. p completes the remaining steps of Apply(p, op, O).

Then Apply(p, op, O) returns u and Apply(q, op', O) returns u'.

Proof Clearly S4 ≈_p S3. Therefore, as in S3, Apply(p, op, O) returns u in S4. Since ℐ is
1-tolerant, and since only O_k has failed by the completion of Apply(q, op', O), Apply(q, op', O)
must return a non-⊥ response. From the definitions of u, u', v, v', it is easy to verify that
the only non-⊥ response that satisfies linearizability is u'. □


Claim 7.11 Consider

Scenario S5 (scenario starts with O in state S)

1. p initiates and partially executes Apply(p, op, O) by completing the steps in σ.

2. O_k fails by R-crash.

3. q initiates and completes (all the steps of) Apply(q, op', O).

4. O_1, ..., O_{k-1} and O_{k+1}, ..., O_n also fail by R-crash.

5. p completes the remaining steps of Apply(p, op, O).

Then Apply(p, op, O) returns u.

Proof Clearly S5 ≈_q S4. Therefore Apply(q, op', O) returns u' as in S4. By arguments similar
to those in Claim 7.9, it can be shown that Apply(p, op, O) returns u. □

Claim 7.12 Consider

Scenario S6 (scenario starts with O in state S)

1. p initiates and partially executes Apply(p, op, O) by completing the steps in σ.

2. q initiates and completes (all the steps of) Apply(q, op', O).

3. All base objects O_1, ..., O_n fail by R-crash.

4. p completes the remaining steps of Apply(p, op, O).

Then Apply(p, op, O) returns u, and Apply(q, op', O) returns v'.

Proof Since S6 ≈_p S5, Apply(p, op, O) returns u as in S5. Since S6 ≈_q S1, Apply(q, op', O)
returns v' as in S1. □

Neither (op, u),(op', v') nor (op', v'),(op, u) is a sequence consistent from the state S of
T. Hence the execution in Claim 7.12 is not linearizable. Thus the failure of O in S6 is not
by R-crash. We conclude that ℐ is not a gracefully degrading implementation for R-crash,
a contradiction which concludes the proof of Theorem 7.2. □

The above discussion raises some questions on the “practicality” of the R-crash model:
Even if “hardware” objects fail by R-crash, “software” objects usually don’t. The
R-omission model defined in this paper does not have this serious limitation. In fact, for
any t > 0, every N-process object type has a t-tolerant gracefully degrading implementation
from any universal list of types. In other words, implementations preserving the R-omission
semantics of the underlying system always exist. This is a formal justification for adopting
the R-omission model of failure. These results are presented in the next section.

7.2  R-omission 

The object type N-consensus is order-sensitive. By Theorem 7.2, N-consensus has no
t-tolerant gracefully degrading implementation for R-crash. In contrast, N-consensus has
such an implementation for R-omission (Theorem 5.2 in Section 5). Further, we can show:

Theorem 7.3 register has a t-tolerant gracefully degrading self-implementation for
R-omission.

Theorems 5.2 and 7.3 can be combined with the universal constructions in [Her91, JT92]
to obtain the following result for R-omission.

A list ℒ of object types is N-universal if every N-process object type has an
implementation from ℒ. An example of an N-universal list is (N-consensus with reset, register).

Theorem 7.4 Every N-process object type has a t-tolerant gracefully degrading
implementation from any N-universal list of object types for R-omission.


8  Related  work 

In an independent work, Afek et al. consider the problem of coping with shared memory
subject to memory failures [AGMT92]. Informally, each failure is modeled as a faulty write.
The following failure models are considered:

A. There is a bound m on the total number of faulty writes.

B. There is a bound f on the total number of data objects that may be affected by memory
failures, and a bound k on the number of faulty writes on each faulty object. A
different model is obtained for k = ∞.

In our terminology, these models are responsive. The second one, with k = ∞, corresponds
to our R-arbitrary failure model.

[AGMT92] focuses on fault-tolerant implementations of the following types of
objects: safe, atomic, binary, and V-valued register from various types of registers;
N-process test&set from N-process test&set and bounded register; and N-consensus
from read-modify-write (RMW). [AGMT92] also gives a universal fault-tolerant
implementation from unbounded RMW, based on Herlihy’s universal implementation. The main
differences between [AGMT92] and this paper are as follows:

1.  [AGMT92]  does  not  consider  any  non-responsive  failure  model. 

2. Amongst the responsive failure models, benign ones, such as R-crash and R-omission,
are also not considered in [AGMT92].

3. This paper does not consider models that bound the number of times faulty objects
can fail (in [AGMT92] each “faulty write” is counted as a failure).

4.  The  two  approaches  to  modeling  failures  are  fundamentally  different.  There  is  no 
direct  way  to  model  benign  failures,  such  as  R-crash  and  R-omission  failures,  with 
“faulty  writes”.  On  the  other  hand,  our  approach — defining  how  each  faulty  object 
deviates  from  its  type — is  not  suited  to  handle  Model  A  above. 

5. This paper introduces the concept of graceful degradation, and presents several related
results, in particular, for R-crash and R-omission failure models. For R-arbitrary
failures, graceful degradation reduces to the “strong wait-freedom” concept considered
in [AGMT92].

6. The concept of fault-tolerant self-implementation is a central theme of this paper.
Corollary 5.1 states sufficient conditions for their existence, and Corollary 5.2 lists
several types that have such implementations. In the Open Problems section of
[AGMT92] it is stated:

“It would be particularly interesting to implement memory-fault tolerant
data objects directly from similar, faulty objects, such as test-and-set from
test-and-set, without using atomic registers, or read-modify-write from
read-modify-write, without using an unbounded universal construction.”

It is interesting to note that both of these types do have fault-tolerant self-implementations.
For bounded RMW, this is a direct consequence of Corollary 5.1. For N-process test&set,
one can combine the fault-tolerant implementation of test&set from test&set and
bounded register [AGMT92], with the implementation of bounded register from
test&set [Jay93].

7. The existence of a fault-tolerant self-implementation of consensus, shown in this
paper, does not follow from the results in [AGMT92].

8. The fault-tolerant implementation of N-process test&set from test&set and bounded
register shown in [AGMT92] does not follow from our results (when N > 2).


A Formal model


Our formal model is based on I/O Automata [LT88]. We use the model to make our
definitions of failure models (Appendix B) and fault-tolerant implementations (Appendix
C) precise. The implementations in the paper are described in the more intuitive Pascal-like
style. In the following, we borrow several definitions from [HW90, Her91]. There are,
however, some differences between our model and Herlihy’s [Her91]. Notable among these
are: (i) our addition of an explicit “crash” state for a process, (ii) the definitions of
wait-freedom and implementation, (iii) the added assumption of fairness in our model, and (iv)
the definition of clocked concurrent systems.

A.1 I/O Automata

An  I/O  Automaton  A  is  a  non-deterministic  automaton  with  the  following  components: 

1. States(A) is a finite or infinite set of states, including a distinguished set of starting
states.

2. In(A) is a set of input events.

3. Out(A) is a set of output events.

4. Int(A) is a set of internal events.

5. Step(A) is a transition relation given by a set of triples (s, e, s'), where s and s' are
states, and e is an event. Such a triple is called a step, and it means that an automaton
in state s can undergo a transition to state s' and that transition is associated with
event e.

If (s, e, s') is a step, we say e is enabled in state s. I/O Automata (abbreviated hereafter
as automata) must additionally satisfy the requirement that input, output, and internal
events are disjoint, and every input event is enabled in every state.
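
The five components and the two side conditions translate directly into a small data structure. The following is a minimal sketch for finite automata only; the class name `IOAutomaton`, its field names, and the method names are assumptions chosen for illustration and do not come from [LT88] or from this paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IOAutomaton:
    """A finite I/O automaton; field names are illustrative assumptions."""
    states: frozenset
    start: frozenset       # distinguished starting states
    inputs: frozenset      # In(A)
    outputs: frozenset     # Out(A)
    internals: frozenset   # Int(A)
    steps: frozenset       # Step(A): triples (s, e, s')

    def enabled(self, state):
        """Events e for which some step (state, e, s') exists."""
        return {e for (s, e, _) in self.steps if s == state}

    def is_well_formed(self):
        """Event classes are disjoint and every input is enabled everywhere."""
        disjoint = not (self.inputs & self.outputs
                        or self.inputs & self.internals
                        or self.outputs & self.internals)
        input_enabled = all(self.inputs <= self.enabled(s) for s in self.states)
        return disjoint and input_enabled

# A one-state automaton with a single output event and a single step.
clock = IOAutomaton(
    states=frozenset({"s"}), start=frozenset({"s"}),
    inputs=frozenset(), outputs=frozenset({"tick"}), internals=frozenset(),
    steps=frozenset({("s", "tick", "s")}),
)
print(clock.is_well_formed(), clock.enabled("s"))
```

An automaton with an input event that is not enabled in some state fails the `is_well_formed` check, which mirrors the requirement that an automaton can never block its inputs.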

An execution fragment of an automaton A is a finite sequence s_0, e_1, s_1, e_2, s_2, ..., e_n, s_n
or an infinite sequence s_0, e_1, s_1, e_2, s_2, ... of alternating states and events such that
(s_i, e_{i+1}, s_{i+1}) is a step of A. An execution is an execution fragment in which s_0 is a
starting state. A history fragment of an automaton is the subsequence of events in an
execution fragment of the automaton. A history of an automaton is the subsequence of
events in an execution. An execution fragment E is fair if either E is finite, or E is infinite
and every internal event or output event that is enabled in every state of a suffix of E
occurs infinitely many times in E.*

* Since this simple notion of fairness is adequate for our purpose, we do not need the general
machinery described in [LT88] for formulating fairness.

A new automaton can be constructed by composing a set of compatible automata. A
set of automata is compatible if no two of them share any internal or output events. That
is, for every pair of distinct automata A, B in the set, (Int(A) ∪ Out(A)) ∩ (Int(B) ∪ Out(B)) = ∅.
A state of the composed automaton S is a tuple of the components’ states, and a starting
state of S is a tuple of the components’ starting states. The set of output events of S,
Out(S), is the union of the sets of output events of the component automata. The set of
internal events of S, Int(S), is the union of the sets of internal events of the component
automata. The set of input events of S, In(S), is IN − Out(S), where IN is the union of
the sets of input events of the component automata. A triple (s, e, s') is in Step(S) if and
only if, for all the component automata A, one of the following holds: (1) e is an event of A,
and the projection of the step onto A is in Step(A), or (2) e is not an event of A, and the
state of A in s and s' is the same.
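
The compatibility condition and the event sets of a composition are simple set computations. The sketch below assumes, purely for illustration, that each automaton is represented by a dictionary with keys `"in"`, `"out"` and `"int"` holding its event sets; the event names in the example are likewise hypothetical.

```python
def compatible(automata):
    """True if no two distinct automata share an internal or output event."""
    for i, a in enumerate(automata):
        for b in automata[i + 1:]:
            if (a["int"] | a["out"]) & (b["int"] | b["out"]):
                return False
    return True

def composed_signature(automata):
    """Event sets of the composition: In(S) = IN - Out(S), unions otherwise."""
    out = set().union(*(a["out"] for a in automata))
    internal = set().union(*(a["int"] for a in automata))
    all_inputs = set().union(*(a["in"] for a in automata))
    return {"in": all_inputs - out, "out": out, "int": internal}

# A process that invokes an operation and an object that responds to it.
process = {"in": {"respond", "crash"}, "out": {"invoke"}, "int": {"think"}}
obj = {"in": {"invoke"}, "out": {"respond"}, "int": set()}
print(compatible([process, obj]))
print(composed_signature([process, obj]))
```

In the example, `invoke` and `respond` each appear as an output of one component and an input of the other, so they become outputs of the composition, and only `crash` remains an input.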

If H is a history of a composed automaton and A_1, A_2, ..., A_k are component automata,
then H|{A_1, A_2, ..., A_k} is the subhistory of H consisting of all events e, where e is an event
of one of A_1, A_2, ..., A_k.

A.2 Object type

An object type T is a tuple (N, OP, RES, G), where N is an integer greater than one, OP and
RES are sets of operations and responses respectively, and G is a directed finite or infinite
graph in which each edge has a label of the form (op, res) where op ∈ OP and res ∈ RES.
Intuitively, if O is an object of type T, then O supports the operations in OP and may be
shared by N processes (we say T is an N-process type). G specifies the expected behavior
of O in the absence of concurrent operations on O.

The vertices of G are the states of T. One state of T is the initial state. A state s of
T is reachable if there is a path in G from the initial state to s. We assume that every state
of T is reachable. A sequence S = (op_1, res_1), (op_2, res_2), ..., (op_l, res_l) is consistent from a
state s of T if there is a path labeled S in G from the state s. S is consistent with respect
to T if it is consistent from the initial state of T.

An object type T is total if for every state s of T, and every operation op ∈ OP, there
is a response res such that there is an edge labeled (op, res) from s in G. All object types
studied in this paper are assumed to be total. T is deterministic if for every state s of
T and every operation op ∈ OP, there is at most one edge from s labeled (op, res). T is
non-deterministic otherwise. T is finite if G is finite; T is infinite otherwise.
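
For a finite type, the graph G and the properties defined on it can be written down directly. The following sketch assumes a representation that is not in the paper: G as a dictionary mapping each state to a list of `(op, res, next_state)` edges, with a binary test&set (without reset) as the hypothetical example type.

```python
def is_consistent(edges, state, seq):
    """True if G has a path from `state` labeled by the (op, res) pairs in seq."""
    frontier = {state}
    for op, res in seq:
        frontier = {t for s in frontier
                    for (o, r, t) in edges.get(s, ())
                    if (o, r) == (op, res)}
        if not frontier:
            return False
    return True

def is_total(edges, ops):
    """Every operation has at least one labeled edge from every state."""
    return all(any(o == op for (o, _, _) in out)
               for out in edges.values() for op in ops)

def is_deterministic(edges):
    """From every state, each operation labels at most one edge."""
    for out in edges.values():
        ops_seen = [o for (o, _, _) in out]
        if len(ops_seen) != len(set(ops_seen)):
            return False
    return True

# Hypothetical example type: the first tas returns 0, every later one returns 1.
TEST_AND_SET = {0: [("tas", 0, 1)], 1: [("tas", 1, 1)]}

print(is_consistent(TEST_AND_SET, 0, [("tas", 0), ("tas", 1)]))
print(is_consistent(TEST_AND_SET, 0, [("tas", 0), ("tas", 0)]))
print(is_total(TEST_AND_SET, {"tas"}), is_deterministic(TEST_AND_SET))
```

`is_consistent` keeps a set of reachable states rather than a single state so that it also works for non-deterministic types, where one labeled step may lead to several states.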

A.3 Processes and objects

An object is an automaton with two attributes: a unique name and a type. A process is an
automaton with a unique name. A process automaton P satisfies the following properties:

1. There is a distinguished state CRASHED(P) in States(P).

2. The event crash(P) is in In(P).

3. For every state s ∈ States(P), (s, crash(P), CRASHED(P)) is in Step(P).

4. The event crashed(P) is in Out(P), and is enabled in the state CRASHED(P).

5. If (CRASHED(P), e, s) is in Step(P), then either e = crashed(P), or e is an input
event of P, and s = CRASHED(P).

The above conditions capture the notion that an adversary can crash a process at any
time by generating the input event crash(P) (see 2 and 3); and once it crashes, a process
remains crashed forever (see 5).

A.4 Clock

A clock is an automaton with a single state s, a single output event tick, and a single step
(s, tick, s). It has no input or internal events.

A.5 Concurrent system

A concurrent system consisting of processes P_1, P_2, ..., P_n, and objects O_1, O_2, ..., O_m is an
automaton composed from process automata P_1, ..., P_n and object automata O_1, ..., O_m.
We denote such a concurrent system by (P_1, P_2, ..., P_n; O_1, O_2, ..., O_m). A clocked
concurrent system† consisting of P_1, ..., P_n, and objects O_1, ..., O_m has an additional component,
the clock automaton C, and is denoted by (P_1, ..., P_n; O_1, ..., O_m; C). The output events
of a process P_i include invoke(P_i, op, O_j), where op is an operation supported by the type
of O_j, and the input events of P_i include respond(P_i, res, O_j), where res is a response.
We refer to the events invoke(P_i, op, O_j) and respond(P_i, res, O_j) as invocations and
responses respectively. An object O_j includes input events invoke(P_i, op, O_j), and output
events respond(P_i, res, O_j). Process and object names are unique, and no two automata
among processes and objects share any internal or output events. This ensures that the
process and object automata are compatible, and therefore, can be composed.

Let σ be a sequence of events or a sequence of states and events (for example, σ can
be a history or an execution). A response r matches an invocation i in σ if i is the latest
event in σ that precedes r such that the process and object names of i and r agree. An
operation in σ is a pair of events, an invocation and its matching response. A relation <_σ
reflecting the partial "real time" order of operations in σ is defined as follows: op <_σ op'
if the response of op precedes the invocation of op' in σ. Two operations unrelated by <_σ
are said to be concurrent in σ. An invocation is pending in σ if it has no matching response.
Complete(σ) denotes the maximal subsequence of σ in which there is no pending invocation.
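
The definitions above can be made concrete with a small sketch. It is only an illustration under stated assumptions: events are modelled as Python tuples, and the names Event, match_responses, complete, precedes and concurrent are hypothetical helpers that do not appear in the paper. Operations are represented by the pair of positions of their invocation and matching response in σ.

```python
from typing import NamedTuple

class Event(NamedTuple):
    kind: str      # "invoke" or "respond"
    proc: str      # process name
    value: object  # the operation (for invoke) or the response (for respond)
    obj: str       # object name

def match_responses(sigma):
    """Return (ops, pending). ops lists (invocation index, response index)
    pairs; pending lists indices of invocations with no matching response."""
    latest = {}            # (proc, obj) -> index of the latest unanswered invocation
    ops, pending = [], []
    for idx, e in enumerate(sigma):
        key = (e.proc, e.obj)
        if e.kind == "invoke":
            if key in latest:              # an older invocation can no longer be matched
                pending.append(latest[key])
            latest[key] = idx
        elif key in latest:                # latest agreeing event is an invocation
            ops.append((latest.pop(key), idx))
    pending.extend(latest.values())
    return ops, sorted(pending)

def complete(sigma):
    """Complete(sigma): drop every pending invocation, keep everything else."""
    _, pending = match_responses(sigma)
    drop = set(pending)
    return [e for i, e in enumerate(sigma) if i not in drop]

def precedes(op_a, op_b):
    """op_a <_sigma op_b: the response of op_a precedes the invocation of op_b."""
    return op_a[1] < op_b[0]

def concurrent(op_a, op_b):
    """Two operations are concurrent if neither precedes the other."""
    return not precedes(op_a, op_b) and not precedes(op_b, op_a)
```

For instance, if P1 and P2 each invoke an operation before either responds, the two resulting operations are unrelated by <_σ and concurrent returns True for them.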

A history H of a concurrent system S = (P_1, P_2, ..., P_n; O_1, O_2, ..., O_m) is k-well-formed
if, for each pair P_i, O_j, (H|P_i)|O_j begins with an invocation, and alternates invocations
and matching responses^16, and H|P_i has at most k pending invocations in H. The
concurrent system S is k-well-formed if every history of S is k-well-formed. Intuitively, in
a k-well-formed concurrent system, if an invocation of a process P on object O is pending,
then P may not issue a new invocation on O; however, P may issue an invocation on a
different object O' as long as the number of pending invocations from P does not exceed
k. The need for a k-well-formed system, for k > 1, arises while designing implementations
that tolerate non-responsive failures of the underlying objects. For example, it is easy to
see that any implementation that has to be wait-free in spite of the crash of at most t
underlying objects must be at least (t + 1)-well-formed. We assume that a concurrent system
is 1-well-formed unless specifically mentioned otherwise.

^15 Clock ensures that the system execution progresses, no matter how the other components in the system
behave. This simplifies the definition of wait-free implementations, especially wait-free implementations that
must tolerate non-responsive failures.

^16 With the exception of the last invocation, which may not have a matching response.
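
As an illustration of k-well-formedness, the following sketch checks a finite history against the definition. It is a minimal example under assumptions: the Event tuple and the function is_k_well_formed are hypothetical names introduced here, and a history is simply a Python list of events.

```python
from collections import defaultdict
from typing import NamedTuple

class Event(NamedTuple):
    kind: str   # "invoke" or "respond"
    proc: str   # process name
    obj: str    # object name

def is_k_well_formed(history, k):
    """Check that, per (process, object) pair, events alternate starting with
    an invocation, and that no process ends with more than k pending invocations."""
    expect_invoke = defaultdict(lambda: True)  # (proc, obj) -> next event must be an invocation
    pending = defaultdict(int)                 # proc -> number of pending invocations
    for e in history:
        key = (e.proc, e.obj)
        if e.kind == "invoke":
            if not expect_invoke[key]:
                return False                   # second invocation on obj while one is pending
            expect_invoke[key] = False
            pending[e.proc] += 1
        else:
            if expect_invoke[key]:
                return False                   # response with no preceding invocation
            expect_invoke[key] = True
            pending[e.proc] -= 1
    return all(count <= k for count in pending.values())
```

A process that invokes an operation on O1 and then, without waiting, on a different object O2 yields a history that is 2-well-formed but not 1-well-formed, which matches the intuition given above.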

In  this  paper,  we  restrict  our  attention  to  only  fair  executions  of  concurrent  systems. 
Thus, when we refer to infinite executions in this section and in Sections 3 and 4, we
implicitly  assume  they  are  fair. 

A.6 Linearizability

The behavior of an object O in an execution E, denoted by B(O, E), is the subsequence of
invocation and response events of O in E.

A behavior B is linearizable with respect to type T if B can be extended to B' by appending
zero or more responses, and there is a sequence σ = invoke(P_{i_1}, op_1, O), respond(P_{i_1}, res_1, O),
invoke(P_{i_2}, op_2, O), respond(P_{i_2}, res_2, O), ..., invoke(P_{i_l}, op_l, O), respond(P_{i_l}, res_l, O), such
that:

1. σ is a permutation of the events in Complete(B').

2. <_B ⊆ <_σ.

3. (op_1, res_1), (op_2, res_2), ..., (op_l, res_l) is consistent with respect to T.

Informally,  extending  B  to  B'  captures  the  notion  that  some  operations  in  B  may 
have  taken  effect,  although  the  responses  have  not  appeared  yet.  The  definition  captures 
the  notion  that  processes  appear  to  interleave  at  the  granularity  of  complete  operations 
on O (as is evident from the form of σ and Condition 1), the notion that this apparent
interleaving respects the real time order (Condition 2) and the semantics of the object type
T  (Condition  3). 

An object O is linearizable with respect to type T in a finite execution E of a concurrent
system if B(O, E) is linearizable with respect to T.

Object O is linearizable with respect to type T in an infinite execution E of a concurrent
system if and only if it is linearizable with respect to T in every finite prefix of E.
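
For small finite behaviors, the three conditions of the definition can be checked by exhaustive search. The sketch below is not the paper's method, only an illustration under assumptions: the type is taken to be deterministic and given as a function apply(state, op) returning (new_state, response), whereas the paper specifies a type by a graph; the names Op, linearizable and register are hypothetical. The sketch also orders a pending invocation after every operation that responded before it was invoked, which is an assumption about how the real-time order extends to B'.

```python
from typing import NamedTuple, Optional

class Op(NamedTuple):
    proc: str
    op: tuple                # e.g. ("write", 1) or ("read",)
    inv: int                 # position of the invocation in B
    res_pos: Optional[int]   # position of the response in B, None if pending
    res: object              # recorded response (ignored if pending)

def linearizable(ops, apply, init):
    """Exhaustively search for a sequence sigma satisfying Conditions 1-3."""
    def search(remaining, state):
        if all(o.res_pos is None for o in remaining):
            return True      # every completed operation is placed; leftover pending ones are dropped
        for o in remaining:
            # Real-time order: o may come next only if no remaining operation
            # responded before o was invoked.
            if any(p.res_pos is not None and p.res_pos < o.inv for p in remaining):
                continue
            new_state, res = apply(state, o.op)
            # A completed operation must return what the type dictates; a pending
            # one receives whatever response is appended when extending B to B'.
            if o.res_pos is not None and res != o.res:
                continue
            if search(remaining - {o}, new_state):
                return True
        return False
    return search(frozenset(ops), init)

def register(state, op):
    """Hypothetical deterministic read/write register used as the type T."""
    if op[0] == "write":
        return op[1], "ack"
    return state, state      # read returns the current value
```

With a register initialized to 0, a read that returns 1 while a write(1) is still pending is accepted, since the pending write may already have taken effect; a read that returns 1 and completes before write(1) is even invoked is rejected. The search is exponential in the number of operations, so it is only suitable for toy examples.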

A.7 Wait-freedom

Let E be an execution of a concurrent system. An object O is wait-free in E if either (i) E
is finite, or (ii) every invocation on O by a process that does not crash in E has a matching
response.

A.8 Correctness


An  object  O  is  correct  in  an  execution  E  if  one  of  the  following  holds: 

•  O is wait-free in E, and O is linearizable with respect to its type in E.

•  More  than  N{T)  distinct  processes  have  invocations  on  O  in  E. 

The latter condition captures the notion that an object need not exhibit any sane
behavior  if  accessed  by  more  processes  than  the  object  is  intended  for. 

An  object  O  fails  in  an  execution  E  if  it  is  not  correct  in  E. 
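
The two definitions combine as simple predicates. The sketch below is only an illustration: it assumes the relevant facts about an execution (whether it is finite, which processes crashed, which have pending invocations on O, and whether O is linearizable in it) have already been established, and the function names is_wait_free and is_correct are hypothetical.

```python
def is_wait_free(execution_is_finite, pending_procs, crashed_procs):
    """pending_procs: processes with an invocation on O that has no matching
    response; crashed_procs: processes that crash in the execution."""
    return execution_is_finite or set(pending_procs) <= set(crashed_procs)

def is_correct(wait_free, linearizable, invoking_procs, n_of_type):
    """O is correct if it is wait-free and linearizable, or if more than
    N(T) distinct processes have invocations on it."""
    return (wait_free and linearizable) or len(set(invoking_procs)) > n_of_type

def fails(wait_free, linearizable, invoking_procs, n_of_type):
    """An object fails in an execution if it is not correct in it."""
    return not is_correct(wait_free, linearizable, invoking_procs, n_of_type)
```

Note that an object of a type with N(T) = 2 that is accessed by three processes counts as correct whatever it does, reflecting the second condition above.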


A.9 Implementations

Let Obj(T) denote the universe of objects whose type is T. Let C = (T_1, T_2, ..., T_n) be a list
of object types (the T_i's are not necessarily distinct). A wait-free implementation of T from C
for processes P_1, P_2, ..., P_{N(T)} is a function I : Obj(T_1) × Obj(T_2) × ... × Obj(T_n) → Obj(T)
satisfying the following conditions:

1. If O = I(O_1, O_2, ..., O_n), the automaton of O has the structure of a concurrent system
(F_1, F_2, ..., F_{N(T)}; O_1, O_2, ..., O_n), for some process automata F_1, F_2, ..., F_{N(T)}.

2. F_i and F_j (i ≠ j) have no common events.

3. If O = I(O_1, ..., O_n), each input event invoke(P_i, op, O) of O is an input event of F_i;
each output event respond(P_i, res, O) of O is an output event of F_i.

4. Each output event crashed(P_i) of P_i is matched with the input event crash(F_i) of F_i.

5. Let O_1, O_2, ..., O_n be any distinct objects of type T_1, T_2, ..., T_n, respectively, and O =
I(O_1, ..., O_n). For every execution E of the clocked concurrent system (P_1, P_2, ..., P_{N(T)}; O; C),
if O_1, O_2, ..., O_n are correct in E, then O is also correct in E.

In the above, the F_i's are called the front-ends, O = I(O_1, O_2, ..., O_n) is called a
derived object of the implementation I, and O_1, O_2, ..., O_n are called the base objects of O.

The front-end F_i models the procedure Apply (called by process P_i to execute operations
on a derived object) alluded to in the informal model of Section 2.

Condition  1  states  that  a  derived  object  is  constituted  by  base  objects  and  access 
procedures  (front-ends). 

Condition 2 captures the notion that the execution of a step of the implementation by
one process P_i cannot affect another process P_j.

Condition 3 captures the notion that (i) invoking an operation on O by process P_i
causes the front-end F_i to be activated, and (ii) the value returned by the front-end F_i is
the response of O.

Condition 4 captures our intuition that when a process P_i crashes, the front-end
F_i of that process must stop executing.

Condition  5  ensures  that  a  derived  object  behaves  correctly  when  its  base  objects  do. 

All implementations studied in this paper are wait-free. Hereafter we write "implementation"
as shorthand for "wait-free implementation". The implementation I is a
self-implementation if T_1 = T_2 = ... = T_n = T. The resource complexity of I is n, the number
of base objects that make up a derived object of the implementation.


B  Models  of  failure 

Failure models for objects were explained in Section 3 using the informal terminology of
Section 2. We present here the formal definitions of these failure models based on the formal
model developed in Appendix A.

The failure models fall into two broad classes: responsive and non-responsive. As we
will see, in most models of failure, an object O of type T that fails may return a response
that is not in RES(T). When a process P gets such a response from O, it knows that O is
faulty. Thus, it is reasonable to assume that P does not invoke operations on O thereafter.
We restrict our attention to executions in which this assumption holds.

B.1 Responsive models of failure

Responsive  failure  models  share  the  following  property:  even  an  object  that  fails  in  an 
execution  E,  is  wait-free  in  E. 

B.1.1  R-crash 

An object O fails by R-crash in an execution E of a concurrent system iff it fails in E, and
the following hold in E:

1. O is wait-free.

2. Every response from O either belongs to RES(T) or is ⊥ (where ⊥ is a distinguished
value not in RES(T), T being the type of O).

3. If op <_E op' and the response for op is ⊥, then the response for op' is also ⊥. This is
the "once ⊥, everafter ⊥" property of R-crash.

4. Recall B(O, E), the behavior of O in E. Let B' be obtained by removing all operations^17
in B(O, E) whose responses are ⊥. B' is linearizable with respect to the type of O.
This property captures the notion that an object failing by R-crash behaves correctly
until it fails.

^17 Removing an operation involves removing the invocation and the response of that operation.

B.1.2 R-omission


An  informal  motivation  for  this  model  can  be  found  in  Section  3.1.2,  and  a  formal  justifi¬ 
cation  in  Section  7. 

An object O fails by R-omission in an execution E of a concurrent system iff it fails in
E, and the following hold in E:

1. O is wait-free.

2. Every response from O either belongs to RES(T) or is ⊥ (where ⊥ is a distinguished
value not in RES(T), T being the type of O).

3. Let B' be obtained from B(O, E) by removing all response events that get ⊥. Then
B' is linearizable with respect to the type of O.

Property 3 captures the notion that a failed operation of P appears like an incomplete
operation. Also notice the subtle difference in the way we obtain B' from B(O, E) for
R-crash and for R-omission. We urge the reader to understand its implications on the failure
semantics of the two models.
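
The difference between the two constructions of B' can be seen in a short sketch. It is an illustration under assumptions: a behavior is a Python list of events on a single object, ⊥ is encoded by the constant BOTTOM, and the names Event, r_crash_view and r_omission_view are hypothetical.

```python
from typing import NamedTuple

BOTTOM = "⊥"   # stands for the distinguished value not in RES(T)

class Event(NamedTuple):
    kind: str      # "invoke" or "respond"
    proc: str      # process name
    value: object  # the operation (for invoke) or the response (for respond)

def r_crash_view(behavior):
    """B' for R-crash: remove every operation whose response is ⊥,
    that is, both its invocation and its response."""
    keep = [True] * len(behavior)
    last_inv = {}                      # proc -> index of its latest invocation
    for i, e in enumerate(behavior):
        if e.kind == "invoke":
            last_inv[e.proc] = i
        elif e.value == BOTTOM:
            keep[i] = False            # drop the ⊥ response ...
            if e.proc in last_inv:
                keep[last_inv[e.proc]] = False   # ... and its invocation
    return [e for e, k in zip(behavior, keep) if k]

def r_omission_view(behavior):
    """B' for R-omission: remove only the response events that get ⊥,
    leaving the corresponding invocations pending."""
    return [e for e in behavior
            if not (e.kind == "respond" and e.value == BOTTOM)]
```

Consider a register initialized to 0 on which P1's write(1) gets ⊥ while a concurrent read by P2 returns 1. Under R-omission the write remains in B' as a pending invocation, so it may have taken effect and the read of 1 can be explained. Under R-crash the write disappears from B' entirely, and a read returning 1 has no write to account for it.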

B.1.3  R-arbitrary 

An object fails by R-arbitrary in an execution E of a concurrent system iff it fails in E and
is wait-free in E.

B.2  Non-responsive  models  of  failure 

Each  responsive  model  of  failure  has  its  non-responsive  counter-part.  The  difference  is  that, 
with non-responsive failures, an object that fails in an execution E may not be wait-free in
E. 

B.2.1  Crash 

An object O fails by crash in an execution E of a concurrent system iff it fails in E, and
the following hold in E:

1. B(O, E) is linearizable with respect to the type of O.

2. The total number of responses from O in E is finite.

Property 2 captures the notion that an object that fails by crash does so at some finite
point in the execution. Hence the number of times it will have responded in that execution
must be finite.

B.2.2 Omission


An object O fails by omission in an execution E of a concurrent system iff it fails in E, and
B(O, E) is linearizable with respect to the type of O.

B.2.3  Arbitrary 

An object O fails by arbitrary in an execution E of a concurrent system iff it fails in E.


C  Definition  of  fault-tolerant  implementations 

An implementation I of type T for processes P_1, P_2, ..., P_{N(T)} is t-tolerant for failure model
M if every derived object O of I has the following property: In every execution of the
clocked concurrent system (P_1, P_2, ..., P_{N(T)}; O; C), if at most t base objects of O fail, and
they fail by M, then O is correct.

An implementation I of type T for processes P_1, P_2, ..., P_{N(T)} is gracefully degrading
for failure model M if every derived object O of I has the following property: In every
execution of the clocked concurrent system (P_1, P_2, ..., P_{N(T)}; O; C), if all base objects of
O that fail, fail by M, then either O is correct or it fails by M.


D  Type  definitions 

Recall that an object type T is defined (Section 2) as a tuple (N, OP, RES, G), where N
is the number of processes supported by an object O of type T, OP is a set of operations
supported by O, RES is a set of result values, and G is a graph giving the sequential
specification of O. In this appendix, we specify OP, RES and G for most object types that
occur in the paper. The parameter N is unspecified: each choice of N results in a different
type. Similarly, in most cases, the initial state of G is not specified. A new type results for
each choice of an initial state.


OP  = {compare&swap(v1, v2) | v1, v2 are booleans}
RES = {0, 1}

Object State:
    X, a boolean

compare&swap(v1, v2)
    if X = v1 then
        X := v2
    return(X)


Figure 8: Compare&swap


OP  = {reset()} ∪ {propose(v) | v ∈ {0, 1}}
RES = {0, 1, ack}

Object State:
    X ∈ {0, 1, ⊥}, initially ⊥

propose(v)
    if X = ⊥ then
        X := v
    return(X)

reset()
    X := ⊥
    return(ack)


Figure 9: Consensus-with-reset
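
The sequential specification in Figure 9 translates directly into code. The class below is a sketch of that specification only, with no concurrency or fault tolerance; the class name ConsensusWithReset and the use of Python's None to stand for ⊥ are assumptions made for illustration.

```python
BOTTOM = None   # stands for ⊥, the "no value decided yet" state

class ConsensusWithReset:
    """Sequential behavior of the consensus-with-reset type of Figure 9."""

    def __init__(self):
        self.x = BOTTOM              # X is initially ⊥

    def propose(self, v):
        if v not in (0, 1):
            raise ValueError("propose expects 0 or 1")
        if self.x is BOTTOM:         # the first proposal after a reset wins
            self.x = v
        return self.x                # every proposer learns the winning value

    def reset(self):
        self.x = BOTTOM
        return "ack"
```

After propose(1), a later propose(0) still returns 1; only a reset() allows a new value to be decided.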


OP  = {fetch&add(v) | v is an integer}
RES = Set of integers

Object State:
    X, an integer

fetch&add(v)
    X := X + v
    return(X)


Figure 10: Fetch&add


OP  = {enq(v) | v is integer} ∪ {deq()}
RES = {v | v is integer} ∪ {nil, ack}

Object State:
    X, a sequence of integers

enq(v)
    X := X · v
    return(ack)

deq()
    if X is empty then
        return(nil)
    else if X = v · X' then
        X := X'
        return(v)


Figure 11: Queue
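
As with the other types, Figure 11 describes purely sequential behavior. A minimal sketch follows, assuming a class name Queue and the strings "ack" and "nil" as stand-ins for the corresponding elements of RES; collections.deque from the Python standard library holds the sequence X.

```python
from collections import deque

class Queue:
    """Sequential behavior of the queue type of Figure 11."""

    def __init__(self):
        self.x = deque()             # X, a sequence of integers

    def enq(self, v):
        self.x.append(v)             # X := X · v
        return "ack"

    def deq(self):
        if not self.x:               # X is empty
            return "nil"
        return self.x.popleft()      # X = v · X': remove and return the head v
```

Elements leave in the order they were enqueued, in contrast to the stack of Figure 14, which removes from the same end at which it appends.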


OP  = {read(i), write(v, i), move(i) | v, i ∈ {0, 1}}
RES = {0, 1, ack}

Object State:
    X_0, X_1 ∈ {0, 1}

read(i)
    if i = 0 then
        return(X_0)
    else return(X_1)

write(v, i)
    if i = 0 then
        X_0 := v
    else X_1 := v
    return(ack)

move(i)
    X_ī := X_i
    return(ack)


Figure 12: Move


OP  = {write(v) | v is integer} ∪ {read()}
RES = {v | v is integer} ∪ {ack}

Object State:
    X, an integer

read()
    return(X)

write(v)
    X := v
    return(ack)


Figure 13: (Unbounded) Register


OP  = {push(v) | v is integer} ∪ {pop()}
RES = {v | v is integer} ∪ {nil, ack}

Object State:
    X, a sequence of integers

push(v)
    X := X · v
    return(ack)

pop()
    if X is empty then
        return(nil)
    else if X = X' · v then
        X := X'
        return(v)


Figure 14: Stack


OP  = {write(v) | v ∈ {0, 1}} ∪ {read()}
RES = {0, 1, ack}

Object State:
    X ∈ {0, 1, ⊥}, initially ⊥

read()
    return(X)

write(v)
    if X = ⊥ then
        X := v
    return(ack)


Figure 15: Sticky-bit


OP  = {read(i), write(v, i), swap() | v, i ∈ {0, 1}}
RES = {0, 1, ack}

Object State:
    X_0, X_1 ∈ {0, 1}

read(i)
    if i = 0 then
        return(X_0)
    else return(X_1)

write(v, i)
    if i = 0 then
        X_0 := v
    else X_1 := v
    return(ack)

swap()
    temp := X_0
    X_0 := X_1
    X_1 := temp
    return(ack)


Figure 16: Swap


OP  = {test&set(), reset()}
RES = {0, 1, ack}

Object State:
    X ∈ {0, 1}

test&set()
    y := X
    X := 0
    return(y)

reset()
    X := 1
    return(ack)


Figure 17: Test&set